Preface



The Asian Computing Science Conference (ASIAN) series was initiated in 1995 
to provide a forum for researchers in computer science from the Asian region 
to meet and to promote interaction with researchers from other regions. The 
previous four conferences were held, respectively, in Bangkok, Singapore, Kath- 
mandu, and Manila. The proceedings were published in the Lecture Notes in 
Computer Science Series of Springer-Verlag. 

This year’s conference (ASIAN’99) attracted 114 submissions from which 28 
papers were selected through an electronic PC meeting. In addition, 11 papers 
were selected for shorter presentations at the poster sessions. 

The themes for this year’s conference were announced to be: 

— Embedded and Real-Time Systems 

— Formal Reasoning and Verification 

— Distributed and Mobile Computing 

The keynote speaker for ASIAN’99 is Amir Pnueli (Weizmann Institute,
Israel) and the invited speakers are Nicolas Halbwachs (VERIMAG, CNRS, 
France) and Krishna Palem (The Georgia Institute of Technology and Courant 
Institute, New York University, USA). We thank them for accepting our invita- 
tion. 

This year’s conference is being sponsored by the Asian Institute of Technology 
(Thailand), INRIA (France), the National University of Singapore (Singapore), 
and UNU/IIST (Macau). We thank all these institutions for their continued sup- 
port of the ASIAN series. 

This year’s conference will be held in Phuket, Thailand. We are much obliged 
to the Prince of Songkhla University for providing the conference venue and to 
Rattana Wetprasit for making the local arrangements. 

We also wish to thank the PC members and the large number of referees for 
the substantial work put in by them in assessing the submitted papers. 

Finally, it is a pleasure to acknowledge the friendly and efficient support pro- 
vided by Alfred Hofmann and his team at Springer-Verlag in bringing out this 
volume. 
Validation of Synchronous Reactive Systems:
From Formal Verification to Automatic Testing*



Nicolas Halbwachs and Pascal Raymond

Verimag**, Grenoble - France
{Nicolas.Halbwachs,Pascal.Raymond}@imag.fr



Abstract. This paper surveys the techniques and tools developed for
the validation of reactive systems described in the synchronous data-flow 
language Lustre [HCRP91]. These techniques are based on the specifica- 
tion of safety properties, by means of synchronous observers. The model- 
checker Lesar [RHR91] takes a Lustre program, and two observers — 
respectively describing the expected properties of the program, and the 
assumptions about the system environment under which these properties 
are intended to hold — , and performs the verification on a finite state 
(Boolean) abstraction of the system. Recent work concerns extensions 
towards simple numerical aspects, which are ignored in the basic tool. 
Provided with the same kind of observers, the tool Lurette [RWNH98] 
is able to automatically generate test sequences satisfying the environ- 
ment assumptions, and to run the test while checking the satisfaction of 
the specified properties. 



1 Introduction 

Synchronous languages [Hal93, BG92, LGLL91, HCRP91] have been proposed 
to design so-called “reactive systems”, which are systems that maintain a
permanent interaction with a physical environment. In this area, system reliability,
and therefore design validation, are particularly important goals, since most 
reactive systems are safety critical. As a consequence, many validation tools 
have been proposed, which are dedicated to deal with systems described by 
means of synchronous languages. These tools either concern automatic verifi- 
cation [LDBL93, DR94, JPV95, Bou98, RHR91], formal proof [BCDP99], or 
program testing [BORZ98, RWNH98, MHMM95, Mar98]. 

The validation of synchronous programs raises specific problems, such as
taking into account known properties of the environment. It also allows
specific techniques to be applied, since the programs to be validated are
deterministic systems with inputs, in contrast with classical concurrent
processes, which are generally modelled as non-deterministic and closed
systems. Both for formal verification and for testing, the user has to specify:

1. the intended behavior of the program under validation, which may be more
or less precisely defined. In particular, it may consist of a set of properties,
and, for the kind of systems considered, critical properties are most of the
time safety properties.

2. the assumptions about the environment under which the properties specified
in (1) are intended to hold. These assumptions are generally safety
properties, too.

* This work was partially supported by the ESPRIT-LTR project “SYRF”.

** Verimag is a joint laboratory of Université Joseph Fourier, CNRS and INPG,
associated with IMAG.

P.S. Thiagarajan, R. Yap (Eds.): ASIAN’99, LNCS 1742, pp. 1-12, 1999.

(c) Springer-Verlag Berlin Heidelberg 1999

In synchronous programming, a convenient way of specifying such safety
properties is to use “synchronous observers” [HLR93], which are programs
observing the inputs and the outputs of the program under validation, and
detecting the violation of the property. Once these observers have been
written, automatic validation tools can use them for

formal verification: One can verify, by model-checking, that for each input
flow satisfying the assumption, the corresponding output flow satisfies the
property. In general, this verification is performed on a finite-state
abstraction of the program under verification.

automatic testing: The assumption observer is used to generate realistic test
sequences, which are provided to the program; the property observer is used
as an “oracle” determining whether each test sequence “passes” or “fails”.

In this paper, we present these approaches in the context of the declarative
language Lustre [HCRP91]. A model-checker for Lustre, called Lesar [RHR91],
has been available for a long time, and has been extended towards dealing with
simple numerical properties. Two testing tools, Lutess [BORZ98] and Lurette
[RWNH98], are also available; here, we focus on Lurette, which has some
numerical capabilities.

2 Synchronous Observers in Lustre

2.1 Overview of Lustre 

Let us first recall, in a simplified way, the principles of the language Lustre:
A Lustre program operates on flows of values. Any variable (or expression)
x represents a flow, i.e., an infinite sequence $(x_0, x_1, \ldots, x_n, \ldots)$
of values. A program is intended to have a cyclic behavior, and $x_n$ is the
value of x at the nth cycle of the execution. A program computes output flows
from input flows. Output (and possibly local) flows are defined by means of
equations (in the mathematical sense), an equation “x=e” meaning
“$\forall n,\ x_n = e_n$”. So, an equation can be understood as a temporal
invariant. Lustre operators operate globally on flows: for instance, “x+y” is
the flow $(x_0+y_0, x_1+y_1, \ldots, x_n+y_n, \ldots)$. In addition to usual
arithmetic, Boolean, and conditional operators (extended pointwise to flows as
just shown), we will consider only two temporal operators:

— the operator “pre” (“previous”) gives access to the previous value of its
argument: “pre(x)” is the flow $(nil, x_0, x_1, \ldots, x_{n-1}, \ldots)$,
where the very first value “nil” is an undefined (“non initialized”) value.

— the operator “->” (“followed by”) is used to define initial values: “x -> y”
is the flow $(x_0, y_1, \ldots, y_n, \ldots)$, initially equal to x, and then
equal to y forever.
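As an illustration only, the pointwise and temporal operators can be mimicked on finite prefixes of flows. The Python sketch below is not Lustre: the names `NIL`, `pre`, `arrow` and `plus` are chosen here, and a list stands for the first few values of a flow.

```python
NIL = None  # stands for the undefined value "nil"


def pre(xs):
    """pre(x): the flow delayed by one cycle; its first value is nil."""
    return [NIL] + xs[:-1]


def arrow(xs, ys):
    """x -> y: the first value of x, followed by the values of y."""
    return xs[:1] + ys[1:]


def plus(xs, ys):
    """x + y: addition extended pointwise to flows."""
    return [x + y for x, y in zip(xs, ys)]


x = [1, 2, 3, 4]
y = [10, 20, 30, 40]
print(pre(x))       # [None, 1, 2, 3]
print(arrow(x, y))  # [1, 20, 30, 40]
print(plus(x, y))   # [11, 22, 33, 44]
```

Combining the two temporal operators, as in `arrow(x, pre(y))`, hides the undefined first value of `pre`, which is the usual idiom in the listings of this paper.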

As a very simple example, the program shown below is a counter of “events”:

node Count(evt, reset: bool)
returns(count: int);
let
   count = if (true -> reset) then 0
           else if evt then pre(count)+1
           else pre(count);
tel

It takes as inputs two Boolean flows “evt” (true whenever the counted “event”
occurs), and “reset” (true whenever the counter should be reinitialized), and
returns the number of occurrences of “events” since the last “reset”. Once
declared, such a “node” can be used anywhere in a program, as a user-defined
operator. For instance, our counter can be used to generate an event “minute”
every 60 “second”, by counting “second” modulo 60:

mod60 = Count(second, pre(mod60=59));
minute = (mod60 = 0);

2.2 Synchronous Observers

Now, an observer in Lustre will be a node taking as inputs all the flows relevant
to the safety property to be specified, and computing a single Boolean flow, say
“ok”, which is true as long as the observed flows satisfy the property.

For instance, let us write an observer checking that each occurrence of an
event “danger” is followed by an “alarm” before the next occurrence of the
event “deadline”. It uses a local variable “wait”, triggered by “danger” and
reset by “alarm”, and the property will be violated whenever “deadline” occurs
when “wait” is on:

node Property(danger, alarm, deadline: bool)
returns (ok: bool);
var wait: bool;
let
   wait = if alarm then false
          else if danger then true
          else (false -> pre(wait));
   ok = not(deadline and wait);
tel

Assume that the above property is intended to hold about a system S,
computing “danger” and “alarm”, while “deadline” comes from the environment.
Obviously, except if S emits “alarm” simultaneously with each “danger”, it
cannot fulfill the property without any knowledge about “deadline”. Now, assume
we know that “deadline” never occurs earlier than two cycles after “danger”.
This assumption can also be expressed by an observer:

node Assumption(danger, deadline: bool)
returns (ok: bool);
let ok = not deadline or
         (true -> pre(not danger and
                      (true -> pre(not danger))));
tel
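To see how such nodes behave cycle by cycle, the counter and the two observers can be transcribed as small step functions. This is a hand translation into Python, not the output of any Lustre compiler; the class names, the helper `run` and the sample traces are assumptions made for the illustration.

```python
class Count:
    """count = if (true -> reset) then 0
               else if evt then pre(count)+1 else pre(count)"""

    def __init__(self):
        self.first = True      # true only at the first cycle (the "->" operator)
        self.pre_count = None  # memory of pre(count); nil before the first cycle

    def step(self, evt, reset):
        if self.first or reset:
            count = 0
        elif evt:
            count = self.pre_count + 1
        else:
            count = self.pre_count
        self.first, self.pre_count = False, count
        return count


class Property:
    def __init__(self):
        self.pre_wait = False  # encodes "false -> pre(wait)"

    def step(self, danger, alarm, deadline):
        if alarm:
            wait = False
        elif danger:
            wait = True
        else:
            wait = self.pre_wait
        self.pre_wait = wait
        return not (deadline and wait)


class Assumption:
    def __init__(self):
        self.pre_nd = True  # true -> pre(not danger)
        self.pre_a = True   # true -> pre(not danger and (true -> pre(not danger)))

    def step(self, danger, deadline):
        ok = (not deadline) or self.pre_a
        a = (not danger) and self.pre_nd
        self.pre_nd, self.pre_a = (not danger), a
        return ok


def run(node, trace):
    """Feed one input tuple per cycle and collect the outputs."""
    return [node.step(*inputs) for inputs in trace]


T, F = True, False
counts = run(Count(), [(T, F), (T, F), (F, F), (T, F), (T, T)])
# (danger, alarm, deadline): danger at cycle 0, alarm at 2, deadline at 3
good = [(T, F, F), (F, F, F), (F, T, F), (F, F, T)]
# danger at cycle 0, deadline already at cycle 1, no alarm
bad = [(T, F, F), (F, F, T)]
prop_good = run(Property(), good)
prop_bad = run(Property(), bad)
assume_bad = run(Assumption(), [(d, dl) for d, _, dl in bad])
print(counts, prop_good, prop_bad, assume_bad)
```

On the second trace the property observer reports a violation, but the assumption observer also turns false at the same cycle, so that trace is not a realistic one.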







Fig. 1. Validation Program



2.3 Validation Program 

Now we are left with three programs: the program S under validation, and its
two observers, Property and Assumption. We can compose them in parallel, in a
surrounding program called “Validation Program” (see Fig. 1). Our verification
problem comes down to showing that, whatever the inputs to the validation
program, either the output “correct” is always true, or the output “realistic”
is sometimes false. The advantages of using synchronous observers for
specification have been pointed out:

— there is no need to learn and use a different language for specifying than
for programming.

— observers are executable: one can test them to get convinced that the
specified properties are the desired ones.

Notice that synchronous observers are just a special case of the general
technique [VW86] consisting in describing the negation of the property by an
automaton (generally, a Büchi automaton), and showing, by performing a
synchronous product of this automaton and the program, that no trace of the
program is accepted by the automaton. The point is that, in synchronous
languages, the synchronous product is the normal parallel composition, so this
technique can be applied within the programming language.

3 Model-Checking

3.1 Lustre Programs as State Machines 

Of course, a Lustre program can be viewed as a transition system. All operators, 
except pre and ->, are purely combinational, i.e., don’t use the notion of state. 
The result of a -> operator depends on whether the execution is in its first 
cycle or not: let init be an auxiliary Boolean state variable, which is initially 
true, and then always false. The result of a pre operator is the value previously 
taken by its argument, so each pre operator has an associated state variable. All 
these state variables define the state of the program. Of course, programs that
have only Boolean variables have finitely many states and can be fully verified
by model-checking [QS82, CES86, BCM+90, CBM89]: when the program under
verification and both of its observers are purely Boolean, one can traverse the
finite set of states of the validation program. Only states reached from the
initial state without falsifying the output “realistic” are considered, and in
each reached state, one checks that, for each input, either “realistic” is
false, or “correct” is true. This can be done either enumeratively (i.e.,
considering each state in turn) or symbolically, by considering sets of states
as Boolean formulas.
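The enumerative traversal can be sketched in a few lines. The Python code below is an illustration under stated assumptions, not Lesar's algorithm: the function `check`, the state encoding, and the two toy systems (alarm raised in the same cycle as danger, or one cycle later) are all invented here, and the observers are hand transcriptions of the Property and Assumption nodes of Section 2.2.

```python
from collections import deque
from itertools import product


def check(step, init, n_inputs):
    """Traverse the states reachable through realistic inputs.

    Returns None if 'correct' holds for every realistic input in every
    reached state, else a sequence of input vectors leading to a violation.
    """
    seen = {init}
    todo = deque([(init, [])])
    while todo:
        state, path = todo.popleft()
        for inputs in product([False, True], repeat=n_inputs):
            nxt, realistic, correct = step(state, inputs)
            if not realistic:
                continue  # unrealistic input: neither checked nor explored
            if not correct:
                return path + [inputs]
            if nxt not in seen:
                seen.add(nxt)
                todo.append((nxt, path + [inputs]))
    return None


def make_validation_program(immediate):
    """Validation program: hypothetical S composed with both observers."""
    def step(state, inputs):
        pre_danger, pre_wait, pre_nd, pre_a = state
        danger, deadline = inputs  # S simply forwards a sensor as 'danger'
        alarm = danger if immediate else pre_danger
        # Property observer
        wait = False if alarm else (True if danger else pre_wait)
        correct = not (deadline and wait)
        # Assumption observer
        realistic = (not deadline) or pre_a
        a = (not danger) and pre_nd
        return (danger, wait, not danger, a), realistic, correct
    return step


# Initial memories: false -> pre(danger), false -> pre(wait),
# true -> pre(not danger), true -> pre(...)
INIT = (False, False, True, True)
print(check(make_validation_program(immediate=True), INIT, 2))
print(check(make_validation_program(immediate=False), INIT, 2))
```

Under this transcription, the search reports no violation for the first variant. For the second it returns the single input vector in which danger and deadline occur in the same cycle: the Assumption observer, as written, constrains deadline only with respect to the two previous cycles.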

3.2 Lustre Programs as Interpreted Automata 

Programs with numerical variables can be partially verified, using a similar
approach. We consider such a program as an interpreted automaton: the states
of the automaton are defined by the values of the Boolean state variables, as
above. The associated interpretation deals with the numerical part: conditions
and actions on numerical variables are associated with the transitions of the
automaton. An example of such an interpreted automaton will be shown in
Section 4. If it happens that a property can be proved on the (finite) control
part of the automaton, then it is satisfied by the complete program. Otherwise,
the result is inconclusive.



3.3 Lesar

Lesar is a verification tool dedicated to Lustre programs. It performs the 
kind of verification described above, by traversing the set of control states of 
a validation program, either enumeratively or symbolically. More precisely, it
restricts its search to the part of the program that can influence the satisfaction
of the property. This part, sometimes called the cone of influence, can be easily
determined, because of the declarative nature of the language: all dependences 
between variables are explicit. This is an important feature, since experience 
shows that, in many practical cases, the addressed property only concerns a 
very small part of a program: in such a case, Lesar may be able to verify the 
property, even if the whole state space of the program could not be built. 
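Because every dependence is explicit in the equations, the cone of influence amounts to a reachability computation on the dependency graph. The sketch below assumes the graph is given as a Python dictionary mapping each variable to the variables its defining equation mentions; the variable names (in particular the unrelated `door_open` part) are hypothetical.

```python
def cone_of_influence(deps, roots):
    """Variables that can influence the given root variables."""
    cone, stack = set(), list(roots)
    while stack:
        v = stack.pop()
        if v not in cone:
            cone.add(v)
            stack.extend(deps.get(v, ()))  # inputs have no defining equation
    return cone


# Hypothetical program: 'ok' only depends on the early/late logic,
# while the door controller is irrelevant to the property.
deps = {
    "ok": {"early", "late"},
    "early": {"diff", "early"},
    "late": {"diff", "late"},
    "diff": {"diff", "second", "beacon"},
    "door_open": {"button", "stopped"},
    "stopped": {"speed"},
}
print(sorted(cone_of_influence(deps, ["ok"])))
```

Only the state variables inside the cone need to be considered during the traversal, which is why a small property can be checked on a large program.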

4 Towards Numerical Properties 

Only properties that depend only on the control part of the program can be 
verified by model checking. The reason is that Lesar can consider as reachable 
some control states that are in fact unreachable because of the numerical in- 
terpretation, which is ignored during the state space traversal: some transitions 
are considered feasible, while being forbidden by their numerical guards. Let us 
illustrate this phenomenon on a very simple example, extracted from a subway 
speed regulation system: 

A train detects beacons placed along the track, and receives a signal
broadcast each second by a central clock. Ideally, it should encounter one
beacon each second, but, to avoid shaking, the regulation system applies a
hysteresis as follows: let #b and #s be, respectively, the current numbers of
encountered beacons and of elapsed seconds. Whenever #b - #s becomes greater
than 10, the train is considered early, until #b - #s becomes negative.
Symmetrically, whenever #b - #s becomes smaller than -10, the train is
considered late, until #b - #s becomes positive. We only consider the part of
the system which determines whether the train is early or late. In Lustre, the
corresponding program fragment could be:

diff = 0 -> if second and not beacon then pre(diff)-1
            else if beacon and not second then pre(diff)+1
            else pre(diff);

early = false -> if diff > 10 then true
                 else if diff < 0 then false
                 else pre(early);

late = false -> if diff < -10 then true
                else if diff > 0 then false
                else pre(late);

Fig. 2. Interpreted automaton of the subway example

This program has 3 Boolean state variables: the auxiliary variable init
(initially true, and then false forever) and the variables storing the previous
values of early and late. The corresponding interpreted automaton has the
control structure shown by Fig. 2, and, for instance, the transitions sourced
in the state “OnTime” are guarded as follows:

$g_1$: $\mathit{diff} > 10 \wedge \mathit{diff} \geq -10 \rightarrow$ Early
$g_2$: $\mathit{diff} \leq 10 \wedge \mathit{diff} < -10 \rightarrow$ Late
$g_3$: $\mathit{diff} > 10 \wedge \mathit{diff} < -10 \rightarrow$ EarlyLate
$g_4$: $\mathit{diff} \leq 10 \wedge \mathit{diff} \geq -10 \rightarrow$ OnTime

Without any knowledge about numerical guards, the model-checker does not know
that some of these guards ($g_1$ and $g_2$) can be simplified, nor that one of
them ($g_3$) is unsatisfiable. This is why the state “EarlyLate” is considered
reachable.
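For guards over a single integer variable, satisfiability reduces to intersecting intervals. The sketch below assumes the four guards out of “OnTime” read diff > 10 and diff >= -10 (to Early), diff <= 10 and diff < -10 (to Late), diff > 10 and diff < -10 (to EarlyLate), and diff <= 10 and diff >= -10 (to OnTime). It handles only this one-variable special case and is not the linear decision procedure of any actual tool; the encoding of guards as `(operator, constant)` pairs is invented here.

```python
import math


def interval(guard):
    """Integer interval [lo, hi] described by a conjunction of bounds."""
    lo, hi = -math.inf, math.inf
    for op, c in guard:
        if op == ">":
            lo = max(lo, c + 1)
        elif op == ">=":
            lo = max(lo, c)
        elif op == "<":
            hi = min(hi, c - 1)
        elif op == "<=":
            hi = min(hi, c)
        else:
            raise ValueError(f"unknown operator {op!r}")
    return lo, hi


def satisfiable(guard):
    lo, hi = interval(guard)
    return lo <= hi


GUARDS = {
    "g1": [(">", 10), (">=", -10)],   # OnTime -> Early
    "g2": [("<=", 10), ("<", -10)],   # OnTime -> Late
    "g3": [(">", 10), ("<", -10)],    # OnTime -> EarlyLate
    "g4": [("<=", 10), (">=", -10)],  # OnTime -> OnTime
}
for name, guard in GUARDS.items():
    print(name, interval(guard), satisfiable(guard))
```

The computed intervals show both effects mentioned in the text: g1 and g2 collapse to a single bound each, and g3 has an empty interval.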

A transition whose guard is numerically unsatisfiable will be called
statically unfeasible. In our example, if we remove statically unfeasible
transitions, we get the automaton of Fig. 3, where the state “EarlyLate” is no
longer reachable. A simple way of improving the power of a model-checker is to
provide it with the ability of detecting statically unfeasible transitions, in
some simple cases. For instance, unfeasibility of guards made of linear
relations is easy to decide¹.

¹ At least for rational solutions; but since unfeasibility in rational numbers
implies unfeasibility in integers, such an approximate decision is still
conservative.

Fig. 3. The subway example without statically unfeasible transitions

This is why Lesar has been extended with such a decision procedure in linear
algebra: when a state violating the property is reached by the standard
model-checking algorithm, the tool can look, along the paths leading to this
state, for transitions guarded by unfeasible linear guards. If all such “bad”
paths can be cut, the “bad” state is no longer considered reachable. This very
partial improvement significantly increases the number of practical cases where
the verification succeeds.

Of course, we are not always able to detect statically unfeasible transitions.
Moreover, some transitions are unfeasible because of the dynamic behavior of
numerical variables. For instance, in the automaton of Fig. 3, there are direct
transitions from state “Early” to state “Late” and conversely. Now, these
transitions are clearly impossible, since diff varies by at most 1 at each
cycle, and cannot jump from being ≥ 0 in state “Early” to becoming < -10 in
state “Late”. Such transitions are called dynamically unfeasible. Detecting
dynamically unfeasible transitions is much more difficult. We are experimenting
with “linear relation analysis” [HPR97], an application of abstract
interpretation, to synthesize invariant linear relations in each state of the
automaton. If the guard of a transition is not satisfiable within the invariant
of its source state, then the transition is unfeasible. In our example, we get
the invariants shown in Fig. 4, which allow us to remove all unfeasible
transitions.

Fig. 4. The subway example without dynamically unfeasible transitions
(state invariants shown in the figure: diff ≤ 0, -10 ≤ diff ≤ 10, diff ≥ 0)
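The effect of the dynamics can be checked by brute force on a bounded version of the example. The sketch below is not linear relation analysis: it simply executes a Python transcription of the subway fragment from the state reached after the first cycle, with diff artificially kept within an assumed bound of 15, and records which control transitions actually occur. All function names are chosen for the illustration.

```python
def step(state, second, beacon):
    """One cycle of the subway fragment after the initial cycle."""
    diff, early, late = state
    if second and not beacon:
        diff -= 1
    elif beacon and not second:
        diff += 1
    early = True if diff > 10 else (False if diff < 0 else early)
    late = True if diff < -10 else (False if diff > 0 else late)
    return diff, early, late


def control(state):
    """Name of the control state, defined by the Boolean memories."""
    names = {(False, False): "OnTime", (True, False): "Early",
             (False, True): "Late", (True, True): "EarlyLate"}
    return names[state[1:]]


def explore(bound=15):
    init = (0, False, False)  # values computed at the first cycle
    seen, todo, edges = {init}, [init], set()
    while todo:
        s = todo.pop()
        for second in (False, True):
            for beacon in (False, True):
                t = step(s, second, beacon)
                if abs(t[0]) > bound:
                    continue  # outside the bounded window (assumption)
                edges.add((control(s), control(t)))
                if t not in seen:
                    seen.add(t)
                    todo.append(t)
    return seen, edges


def invariants(states):
    """Smallest and largest diff observed in each control state."""
    inv = {}
    for s in states:
        lo, hi = inv.get(control(s), (s[0], s[0]))
        inv[control(s)] = (min(lo, s[0]), max(hi, s[0]))
    return inv


states, edges = explore()
print(sorted(edges))
print(invariants(states))
```

Within the explored window, no transition links “Early” and “Late” directly and “EarlyLate” is never reached. The outer ends of the “Early” and “Late” ranges are artefacts of the bound, whereas the inner ends (0) and the “OnTime” range do not depend on it.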



5 Automatic Testing 

In spite of the progress of formal verification, testing is and will remain an 
important validation technique. On one hand, the verification of too complex 
systems — with too complex state space, or important numerical aspects — will 
remain unfeasible. On the other hand, some validation problems are out of the 
scope of formal verification: it is the case when parts of the program cannot be 
formally described, because they are unknown or written in low level languages; 
it is also the case when one wants to validate the final system within its actual 
environment. So, verification and testing should be considered as complementary 
techniques. Moreover, testing techniques and tools should be mainly devoted 
to cases where verification either fails or does not apply. This is why we are 
especially interested in techniques that cope with numerical systems, that don’t 
need a formal description of the system under test (black box testing), and the 
cost of which doesn’t depend on the internal complexity of the tested system. 

Intensive testing requires automation, since producing huge test sets by hand 
is extremely expensive and error-prone. Now, it appears that the prerequisite for 
automatic generation of test sets is the same as for verification: an automatic 
tester will need a formal description of both the environment — to generate 
only realistic test cases — and the system under test — to provide an “oracle” 
deciding whether each test passes or fails. In section 2, we proposed the use of 
synchronous observers for these formal descriptions. In the Lurette [RWNH98] 
and Lutess [BORZ98] tools, such observers are used to automatically generate 
and run test sequences. In this section, we explain the principles of this genera- 
tion. 

The specific feature of reactive systems is, of course, that they run in closed 
loop with their environment. In particular, they are often intended to control 
their environment. This means that the current input (from the environment) 
may depend on the past outputs (from the system). In other words, the realism 
of an input sequence does not make sense independently of the corresponding 
output sequence, computed by the system under test. This is why, in our ap- 
proach, test sequences are generated on the fly, as they are submitted to the 
system under test. 

More precisely, we assume that the following components are available: 

— an executable version of the system under test, say S. We only need to be
able to run it, step by step.

— the observers A and P, respectively describing the assumptions about the
environment and the properties to be checked during the test.

Moreover, the output “realistic” of the observer A is required not to depend
instantaneously on the outputs “o” of S. Since “o” is supposed to be computed
from the current input “i”, it would be a kind of causality loop if the realism
of “i” depended on “o”.

Basically, the tester only needs to know the source code of the observer A,
and to be able to run the system S and the observer P, step by step. It
considers, first, the initial state of A: in this state, the Lustre code of A
can be simplified, by replacing each expression “e1 -> e2” by “e1”, and each
expression “pre(e)” by “nil”. After this simplification, the result “realistic”
is a combinational expression of the input “i”, say “b(i)”. The satisfaction of
the Boolean formula b(i) can be viewed as a constraint on the initial inputs to
the system. A constraint solver, which will be detailed below, is used to
randomly select an input vector $i_0$ satisfying this constraint. Now, S is run
for a step on $i_0$, producing the output vector $o_0$ (and changing its
internal state). Knowing both $i_0$ and $o_0$, one can run A and P for a step,
to make them change their internal state, and to get the oracle “correct”
output by P. The Lustre code of A can be simplified according to its new state,
providing a new constraint on “i”. The same process can be repeated as long as
the test passes (i.e., P returns “correct = true”), or for a given number of
steps.
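The loop just described can be sketched as follows. This is a simplified illustration, not the implementation of Lurette or Lutess: the constraint is solved by enumerating the four Boolean input vectors and drawing one of the admissible ones at random, the observers are hand transcriptions of the Assumption and Property nodes of Section 2.2, and the two systems under test are hypothetical.

```python
import random
from itertools import product


class ObserverA:
    """Assumption: 'realistic' depends on the current input and on memories only."""
    def __init__(self):
        self.pre_nd, self.pre_a = True, True

    def realistic(self, sensor, deadline):
        return (not deadline) or self.pre_a

    def update(self, danger):
        a = (not danger) and self.pre_nd
        self.pre_nd, self.pre_a = (not danger), a


class ObserverP:
    """Property: the oracle returning 'correct' at each step."""
    def __init__(self):
        self.pre_wait = False

    def step(self, danger, alarm, deadline):
        wait = False if alarm else (True if danger else self.pre_wait)
        self.pre_wait = wait
        return not (deadline and wait)


class ImmediateS:
    """Hypothetical system: danger = sensor, alarm in the same cycle."""
    def step(self, sensor, deadline):
        return sensor, sensor


class DelayedS:
    """Hypothetical system: alarm one cycle after danger."""
    def __init__(self):
        self.pre_danger = False

    def step(self, sensor, deadline):
        alarm, self.pre_danger = self.pre_danger, sensor
        return sensor, alarm


def run_test(system, steps, seed=0):
    rng = random.Random(seed)
    a, p = ObserverA(), ObserverP()
    trace = []
    for _ in range(steps):
        # Constraint solving by enumeration: keep the realistic input vectors.
        candidates = [i for i in product([False, True], repeat=2)
                      if a.realistic(*i)]
        sensor, deadline = rng.choice(candidates)
        danger, alarm = system.step(sensor, deadline)  # black-box step of S
        correct = p.step(danger, alarm, deadline)      # oracle
        a.update(danger)                               # new constraint for next step
        trace.append((sensor, deadline, danger, alarm, correct))
        if not correct:
            return False, trace
    return True, trace


passed, trace = run_test(ImmediateS(), 200, seed=1)
verdict, trace2 = run_test(DelayedS(), 200, seed=1)
print(passed, verdict, len(trace2))
```

Since the inputs are generated one step at a time, the constraint used at each step already reflects the outputs that S produced earlier, which is the point of generating the sequence on the fly.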

The considered tools mainly differ in the selection of input vectors satisfying 
a given constraint. In Lutess [BORZ98], one considers only purely Boolean
observers. A constraint is then a purely Boolean formula, which is represented by
a Binary Decision Diagram. A correct selection corresponds to a path leading 
to a “true” leaf in this BDD. The tool is able to perform such a selection, ei- 
ther using an equiprobable strategy, or taking into account user-given directives. 
Lurette [RWNH98] is able to solve constraints that are Boolean expressions 
involving Boolean inputs and linear relations on numerical inputs. 



Example: Let us illustrate the generation process on a very simple example.
Assume S is intended to regulate a physical value u, by constraining its second
derivative. Initially, both u and its derivative are known to be 0. Then, the
second derivative of u will be in an interval $[-\delta, +\delta]$ around the
(previous) output x of S. An observer of this behavior can be written as
follows:

node A (u, x: real) returns (realistic: bool);
var dudt, d2udt2: real;
let
   dudt = 0 -> (u - pre(u));
   d2udt2 = dudt - pre(dudt);
   realistic = (u=0) -> ((pre(x) - delta <= d2udt2)
                     and (d2udt2 <= pre(x) + delta));
tel

At the first cycle, the code of A is simplified to 

dudt = 0; d2udt2 = nil; realistic = (u=0); 
There is only one way of satisfying the constraint, by choosing $u_0 = 0$. The
system S is run for one cycle, with this input value; let $x_0$ be the returned
value. At the second cycle, we know that

pre(u) = 0 , pre(dudt) = 0 , pre(x) = x0

So the code of A is simplified to

dudt = u; d2udt2 = dudt;
realistic = (x0-delta <= d2udt2) and (d2udt2 <= x0+delta);

which gives the (linear) constraint $x_0 - \delta \leq u \leq x_0 + \delta$.
Assume the value $u_1 = x_0 + \delta$ is selected, and provided to S, which
returns some new value $x_1$. At the next cycle, we know that

pre(u) = pre(dudt) = x0+delta , pre(x) = x1

So, the code of A simplifies to

dudt = u - (x0+delta); d2udt2 = dudt - (x0+delta);
realistic = (x1-delta <= d2udt2) and (d2udt2 <= x1+delta);

which gives the constraint
$x_1 + 2x_0 + \delta \leq u \leq x_1 + 2x_0 + 3\delta$, and so on.
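The successive constraints of this example can be recomputed mechanically. From the equations of A, after the first cycle d2udt2 = (u - pre(u)) - pre(dudt), so the admissible values of u form an interval of half-width delta centred on pre(x) + pre(u) + pre(dudt). The sketch below assumes that the tester always selects the upper bound, as in the text, and that the outputs of S are simply given as a list; the function names and the numerical values are illustrative.

```python
def input_interval(pre_u, pre_dudt, pre_x, delta):
    """Values of u with pre(x)-delta <= (u - pre(u)) - pre(dudt) <= pre(x)+delta."""
    centre = pre_x + pre_u + pre_dudt
    return centre - delta, centre + delta


def intervals_along_run(xs, delta):
    """Constraint on u at cycles 1, 2, ... given the outputs xs of S.

    xs[k] is the value returned by S at cycle k; the upper bound of each
    interval is selected as the next input.
    """
    u, dudt = 0, 0  # first cycle: u = 0 and dudt = 0
    result = []
    for x in xs:
        lo, hi = input_interval(u, dudt, x, delta)
        result.append((lo, hi))
        new_u = hi
        dudt, u = new_u - u, new_u
    return result


# Illustrative values: x0 = 3, x1 = 5, delta = 1
print(intervals_along_run([3, 5], 1))  # [(2, 4), (12, 14)]
```

With these values the second interval is [x1 + 2*x0 + delta, x1 + 2*x0 + 3*delta] = [12, 14], in agreement with substituting pre(u) = pre(dudt) = x0 + delta into the equations of A.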

6 Conclusion 

We have presented some validation techniques, which mainly derive from the 
specification of properties by synchronous observers. While not being restricted 
to synchronous models, this way of specifying properties is especially natural 
and convenient in that context, since the same kind of language can be used to 
describe the system and its properties. 

Our presentation was centered on the language Lustre, but the techniques 
could be adapted to any synchronous language. Notice, however, that some ideas 
were directly suggested by the declarative nature of Lustre. For instance, syn- 
chronous observers were a natural generalization of the relations in Esterel, 
which are a way of expressing known implications or exclusion between input 
events. When transposed into Lustre, these relations are just special cases of 
invariant Boolean expressions. Generalized to any Boolean Lustre expression, 
this mechanism provides a way of specifying any safety property. Also, in test 
sequence generation, the idea of considering an observer as a (dynamic) con- 
straint is especially natural when the observer is written in Lustre, but can be 
adapted to any synchronous language. 
Computing devices are clearly proliferating in a variety of domains ranging from
controlling household appliances, to the cockpits of aircraft. Thus, there is an 
ever increasing demand for cheap and compact computers intended to perform 
a few functions, extremely well. Historically, customization has been the answer 
to this need. However, customization unfortunately implies very high costs — the 
exacerbated costs limit the scope of proliferation of course. To redress this and 
thus enable the extraordinary growth-potential of these emerging domains — 
loosely referred to as embedded systems here — current research in computing 
is revisiting established and stable technologies ranging from (high-level) pro- 
gramming languages at the software end of the spectrum, to gate-level design 
and synthesis at the hardware end. A theme that seems to be emerging, is to 
provide the future application developer with the advantages of customization, 
at costs approaching currently mass-produced commercial-off-the-shelf (COTS) 
software and hardware. Thus, hardware is increasingly being viewed as a flexible 
fabric, amenable to low-cost customization based on the application developer’s 
needs and preferences. A goal of this talk is to outline the “point-technologies” 
that are being innovated to help realize this vision. 

Concretely, let us consider an application development cycle today for a 
COTS microprocessor. Typically, an application developer starts out with a 
program developed in a high-level language and then compiles it, using an
optimizing compiler, into code that runs on hardware with a fixed
instruction-set architecture (or ISA). This notion of application development
is now being revisited, where the fixed aspect of the vendor-specified ISA is
no longer considered to be essential. Specifically, two alternate approaches
are being envisioned. In the first, the programmer’s high-level language
application is viewed as input
to a process aimed at designing a customized implementation of key kernels from 
the application that need to be accelerated: examples of such kernels include the 
cosine transform central to MPEG2, the ghostscript part of a postscript appli- 
cation and others. In this setting, a portion of the silicon stays dedicated to the 
executing kernel during its life-time. A second approach envisions processors that 
have a “core” with a fixed vendor-supplied ISA, but with some amount of cus- 
tomizable logic added on. In this context, the goal is to identify and execute key 
kernels from the application as customized implementations that use the custom 
logic; in this approach, the same piece of logic and hence silicon is reused as new 
kernels are encountered. 
To realize either of the above goals, it is crucial to provide tools and
technologies that can accept a high-level C (or Java) application, and transform it
into an optimized implementation in silicon. A significant part of this lecture will 
be devoted to surveying extant and emerging technologies that aim to provide 
such solutions. These “next-generation” technologies start out as an amalgam of 
existing optimizing compilers, as well as CAD tools for VLSI design. To actually 
achieve the goals stated earlier, a number of research challenges emerge, ranging 
over areas in programming languages, computer architecture, VLSI design, and 
hardware-software co-design, to name a few. A significant part of this talk will 
be devoted to surveying this landscape and identifying key research questions 
of interest. The Trimaran (www.trimaran.org) research infrastructure is
especially well-suited for conducting this research; the talk will conclude with a brief
overview of this infrastructure. 
Abstract. We present a name-passing calculus that can be regarded as
a simplified π-calculus equipped with a cryptographic table. The latter is
a data structure representing the relationships among names. We illus- 
trate how the calculus may be used for modelling cryptographic protocols 
relying on symmetric shared keys and verifying secrecy and authenticity 
properties. Following classical approaches [3], we formulate the verifica- 
tion task as a reachability problem and prove its decidability assuming 
finite principals and bounds on the sorts of the messages synthesized by 
the attacker. 

Keywords: cryptographic protocols, π-calculus, verification.



1 Introduction 

Cryptographic protocols are commonly used to establish secure communication 
channels between distributed principals. Cryptographic protocols seem good can- 
didates for formal verification and several frameworks have been proposed for 
making possible formal and automatable analyses. Formal analyses require for- 
malization of (a) the protocol, (b) the attacker model, and (c) security proper- 
ties, as well as (d) an effective technique to check satisfaction of the properties. 
Addressing the vulnerabilities of the protocol rather than those of the cryptosys- 
tem, a number of approaches assume ‘perfect encryption’ and model a protocol 
as a collection of interacting processes competing against a hostile environment. 
These approaches usually rely either on model-checking techniques (see, e.g.,
[5]), or on general-purpose proof assistant tools to establish invariant properties
(see, e.g., [6]).

The model-checking approach has been remarkably successful in uncovering 
subtle protocol bugs [5], but its applicability is limited to finite instances of 
protocols with bounds on various parameters such as the number of runs and 
the complexity of the messages. Moreover, it requires an explicit modelling
of the attacker. In contrast, theorem-proving approaches model the attacker’s 
capabilities abstractly and can deal with more general situations, but are not 
fully automatic. 

A more recent trend has been the use of name-passing process calculi for 
studying cryptographic authentication protocols. Abadi and Gordon have presented
the spi-calculus [1], an extension of the π-calculus with cryptographic
primitives. Principals of a protocol are expressed in a π-calculus-like notation,
whereas the attacker is represented implicitly by the process calculus notion of 
‘environment’. Security properties are modelled in terms of contextual equivalences,
in contrast to previous approaches which are based on the security model
of, e.g., [3]. The spi-calculus provides a precise notation with a formal operational
semantics, particularly for expressing the generation of fresh names, for
the scope of names, and for identifying the different threads in a protocol and 
the order of events in each thread. These features are important: in the various 
notations found in the literature, the issues of name generation, scoping, data 
sorts, and synthesis capabilities of the adversary were often treated in an ad hoc 
and/or approximate manner. 

Unfortunately, the addition of the cryptographic primitives to the π-calculus
considerably complicates reasoning about the behaviour of processes. Although 
there have been some attempts to simplify this reasoning (see, e.g., [2]), the 
developed theory has not yet led to automatic or semi-automatic verification 
methods. 

In this paper, we present a simple name-passing process calculus equipped 
with a cryptographic table and show how (symmetric key) cryptographic pro- 
tocols may be modelled. The aim is to develop a framework for automatable 
analyses of security properties with minimal modelling of the attacker. The main 
novelties of our work lie in (a) the use of the table to characterise the sharing 
of information between principals and the environment, and the cryptographic 
capabilities of the environment; and (b) the use of sorts of messages synthesized 
by the environment to limit the state space that needs exploration. 

We depart from the approach of Abadi and Gordon in three major ways. 
First, we insist on considering every transmissible value as a name, thus elimi- 
nating complications to the theory arising from structured values. We keep track 
of the relationships amongst names (what name is a ciphertext of what plaintext) 
by means of a cryptographic table. Secondly, we model secrecy and authenticity 
properties as reachability properties that are largely insensitive to the ordering 
of the actions and to their branching structure. Intuitively, we designate con- 
figurations reached after a successful attack as erroneous. Protocol verification 
then involves showing invariance of the property that such error configurations 
are not reachable. Thirdly, we eliminate named communication channels and let 
all communications between principals and the environment transit on a public 
medium. 

In this framework we show, by a “diagram chasing” method, that the verifi- 
cation problem is decidable provided processes are finite and finite bounds are 
assumed on the sorts of the messages synthesized by the environment. Recent 
work by Huima [4] seems to suggest that this last hypothesis can be removed. 



2 The Calculus 



We define a process calculus enriched with a ‘cryptographic table’ to model 
and analyse symmetric key cryptographic protocols. We use $a, b, \ldots$ for names
and $\mathbf{a}, \mathbf{b}, \ldots$ for vectors of names. $N$ denotes the set of names. In the sequel,
principals’ behaviour as well as secrecy and authenticity annotations will be
represented as processes.

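Definition 1 (processes). A process (typically $p, q$) is defined by the following
grammar:

\[
\begin{array}{rcl}
p & ::= & 0 \mid err \mid\ !a.p \mid\ ?a.p \mid (\nu a)\,p \mid [a = b]p, q \mid p \mid q \mid A(\mathbf{a}) \\
  & \mid & \mathsf{let}\ a = \{\mathbf{b}\}_c\ \mathsf{in}\ p \mid \mathsf{case}\ \{\mathbf{b}\}_c = a\ \mathsf{in}\ p\ .
\end{array}
\]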

As usual, $0$ is the terminated process; $err$ is a distinguished ‘error’ process;
$!a.p$ sends $a$ to the environment and becomes $p$; $?a.p$ receives a name from the
environment, binds it to $a$ and becomes $p$; $(\nu a)\,p$ creates a new restricted name $a$
and becomes $p$; $[a = b]p, q$ tests the equality of $a$ and $b$ and accordingly executes
$p$ or $q$; $p \mid q$ is the parallel composition of $p$ and $q$; $A(\mathbf{a})$ is a recursively defined
process; let $a = \{\mathbf{b}\}_c$ in $p$ defines $a$ to be the encryption of $\mathbf{b}$ with key $c$ in $p$;
finally, case $\{\mathbf{b}\}_c = a$ in $p$ defines $\mathbf{b}$ to be the decryption of $a$ with key $c$ in $p$.
The input and restriction operators act as name binders. Moreover, $a$ is bound
in let $a = \{\mathbf{b}\}_c$ in $p$ and the names $\mathbf{b}$ are bound in case $\{\mathbf{b}\}_c = a$ in $p$. We denote
with $fn(p)$ the set of names free in $p$. We assume that for every process identifier
$A(\mathbf{a})$ there is a unique recursive equation $A(\mathbf{a}) = p$ such that $fn(p) \subseteq \{\mathbf{a}\}$.

Let $T$ be a relation in $(\bigcup_{k \geq 1} N^k) \times N \times N$. We write $a \in n(T)$ if the name $a$
occurs in a tuple of the relation $T$. We write $(\mathbf{b}, c, a) \in T$ as $\{\mathbf{b}\}_c = a \in T$. This
notation is supposed to suggest that $c$ is a key, $\mathbf{b}$ is a tuple of plaintext, and $a$ is
the corresponding ciphertext. The relation $T$ induces a strict order $<_T$ on $n(T)$
which we define as the least transitive relation such that:
$\{b_1, \ldots, b_n\}_c = a \in T \Rightarrow b_1, \ldots, b_n, c <_T a$.

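Definition 2 (cryptographic table). A cryptographic table $T$ is a relation in
$(\bigcup_{k \geq 1} N^k) \times N \times N$ which satisfies the following properties: $T$ is finite, $<_T$ is
acyclic,

$\{\mathbf{b}\}_c = a \in T$ and $\{\mathbf{b}\}_c = a' \in T \Rightarrow a = a'$ ($T$ is single valued),

$\{\mathbf{b}\}_c = a \in T$ and $\{\mathbf{b}'\}_{c'} = a \in T \Rightarrow \mathbf{b} = \mathbf{b}'$ and $c = c'$ ($T$ is injective).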
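As an illustration only, the conditions of Definition 2 can be sketched as a small Python data structure. The class name `CryptoTable`, its method names, and the use of strings for names are assumptions made for this sketch and do not come from the calculus itself. Acyclicity is enforced here by a sufficient condition that is stronger than the definition: a ciphertext may only be added if it does not already occur in the table.

```python
from typing import Dict, Optional, Tuple

Name = str
Entry = Tuple[Tuple[Name, ...], Name]  # (plaintext tuple b, key c)


class CryptoTable:
    """Finite relation {b}_c = a, kept single valued, injective and acyclic."""

    def __init__(self) -> None:
        self.enc: Dict[Entry, Name] = {}   # (b, c) -> a
        self.dec: Dict[Name, Entry] = {}   # a -> (b, c)

    def names(self) -> set:
        """n(T): every name occurring in some tuple of the table."""
        out = set(self.dec)
        for plaintext, key in self.enc:
            out.update(plaintext)
            out.add(key)
        return out

    def lookup(self, plaintext, key) -> Optional[Name]:
        """Ciphertext recorded for {plaintext}_key, or None."""
        return self.enc.get((tuple(plaintext), key))

    def add(self, plaintext, key, cipher: Name) -> None:
        """Record {plaintext}_key = cipher, rejecting entries that would
        break single-valuedness, injectivity, or acyclicity."""
        entry = (tuple(plaintext), key)
        if not entry[0]:
            raise ValueError("plaintext tuple must be non-empty")
        if entry in self.enc:
            raise ValueError("{b}_c already has a ciphertext (single valued)")
        if cipher in self.dec:
            raise ValueError("ciphertext already used (injective)")
        # Sufficient for acyclicity: the ciphertext is a brand-new name.
        if cipher in self.names() or cipher in entry[0] or cipher == key:
            raise ValueError("ciphertext must be fresh (acyclic)")
        self.enc[entry] = cipher
        self.dec[cipher] = entry
```

Keeping both directions of the relation in dictionaries makes the single-valued and injective conditions cheap to check on every insertion.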

We introduce a notion of sort for the names in a cryptographic table. 

Definition 3 (sorts). The collection of sorts $Srt$ is the least set that contains
the ground sort $0$ and such that $(s_1, \ldots, s_n) \in Srt$ if $s_i \in Srt$ for $i = 1, \ldots, n$
with $n \geq 2$.

Every name occurring in a cryptographic table can be assigned a unique sort. 

Definition 4 (sorting). Let $T$ be a cryptographic table. We define a function
$srt_T : n(T) \to Srt$ as follows:

\[
srt_T(a) =
\begin{cases}
0 & \text{if } a \text{ is minimal in } <_T \\
(s_1, \ldots, s_n) & \text{if } \{b_1, \ldots, b_{n-1}\}_{b_n} = a \in T,\ srt_T(b_i) = s_i,\ i = 1, \ldots, n\ .
\end{cases}
\]

Intuitively, a sort describes the shape of a message represented by a name. 
For example, if $a$ has sort $(0, (0,0))$, it represents a message of the form $\{b\}_k$,
where $k$ itself is of the form $\{b'\}_{k'}$. The notion of sort plays an important role in
the development of the results of section 4, particularly in bounding the state
space that needs exploration.
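The sorting function of Definition 4 can be sketched recursively. In this minimal Python illustration the table is assumed to be given as a dictionary from each ciphertext to its (plaintext tuple, key) pair; that representation and the function name `sort_of` are hypothetical. Names that are not ciphertexts are exactly the minimal ones, so they get the ground sort.

```python
from typing import Dict, Tuple, Union

Sort = Union[int, tuple]  # 0 is the ground sort; tuples are compound sorts


def sort_of(name: str, dec: Dict[str, Tuple[Tuple[str, ...], str]]) -> Sort:
    """srt_T(name) for an acyclic table given as ciphertext -> (plaintext, key)."""
    if name not in dec:          # minimal in <_T: not a ciphertext
        return 0
    plaintext, key = dec[name]
    # Sorts of the plaintext components followed by the sort of the key.
    return tuple(sort_of(b, dec) for b in plaintext) + (sort_of(key, dec),)


# {b1}_{k1} = k and {b}_k = a, so a has sort (0, (0, 0)).
table = {"k": (("b1",), "k1"), "a": (("b",), "k")}
print(sort_of("a", table))  # (0, (0, 0))
```

The recursion terminates because the table is acyclic.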







Roberto M. Amadio and Sanjiva Prasad 



Definition 5 (configuration). A configuration $r$ is a triple $(\nu\{\mathbf{a}\})\,(p \mid T)$
where $\{\mathbf{a}\}$ is a set of restricted names, $p$ is a process, and $T$ is a cryptographic
table.



We write $r \equiv r'$ if $r$ and $r'$ are identical configurations up to α-renaming of
bound names and associativity-commutativity of parallel composition, with $0$ as
its identity.

We define a reduction relation on configurations. The first five rules describe
the non-cryptographic computation performed by a process. Rules $(out)$ and
$(in)$ concern communication: the sending of a name to the environment and the
reception of a name from the environment. Rules $(\nu)$, $(m)$, and $(rec)$ describe
internal computation: generation of new names, conditional, and unfolding of
recursive definitions.



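\[
\begin{array}{lll}
(out) & (\nu\{\mathbf{a}\})\,(!a.p \mid q \mid T) \to (\nu\{\mathbf{a}\} \setminus a)\,(p \mid q \mid T) & \\[4pt]
(in) & (\nu\{\mathbf{a}\})\,(?a.p \mid q \mid T) \to (\nu\{\mathbf{a}\})\,([b/a]p \mid q \mid T) & \text{if } b \notin \{\mathbf{a}\} \\[4pt]
(\nu) & (\nu\{\mathbf{a}\})\,((\nu a)p \mid q \mid T) \to (\nu\{\mathbf{a}\} \cup \{a\})\,(p \mid q \mid T) & \text{if } a \notin fn(q) \cup n(T) \\[4pt]
(m) & (\nu\{\mathbf{a}\})\,([a = b]p_1, p_2 \mid q \mid T) \to
      \begin{cases}
      (\nu\{\mathbf{a}\})\,(p_1 \mid q \mid T) & \text{if } a = b \\
      (\nu\{\mathbf{a}\})\,(p_2 \mid q \mid T) & \text{if } a \neq b
      \end{cases} & \\[4pt]
(rec) & (\nu\{\mathbf{a}\})\,(A(\mathbf{b}) \mid q \mid T) \to (\nu\{\mathbf{a}\})\,([\mathbf{b}/\mathbf{c}]p \mid q \mid T) & \text{if } A(\mathbf{c}) = p
\end{array}
\]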



The cryptographic table plays a role in the next three rules. Note that the
table allows sharing of information between principals and environments. The
rules $(let_1)$ and $(let_2)$ compute the ciphertext $a'$ associated with $\{\mathbf{b}\}_c$ in $T$,
adding $\{\mathbf{b}\}_c = a'$ to $T$ if it is not already there. The rule $(case)$ tries to decode
the ciphertext $a$ with key $c$. In the rule $(case)$, a deadlock occurs (specified by
the absence of a transition) if either the vectors $\mathbf{b}$ and $\mathbf{b}'$ do not have the same
length or an incorrect key is used for decoding.

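\[
\begin{array}{ll}
(let_1) & (\nu\{\mathbf{a}\})\,(\mathsf{let}\ a = \{\mathbf{b}\}_c\ \mathsf{in}\ p \mid q \mid T) \to (\nu\{\mathbf{a}\})\,([a'/a]p \mid q \mid T) \\
        & \text{if } \{\mathbf{b}\}_c = a' \in T \\[4pt]
(let_2) & (\nu\{\mathbf{a}\})\,(\mathsf{let}\ a = \{\mathbf{b}\}_c\ \mathsf{in}\ p \mid q \mid T) \to (\nu\{\mathbf{a}\} \cup \{a'\})\,([a'/a]p \mid q \mid T \cup \{\{\mathbf{b}\}_c = a'\}) \\
        & \text{if } a' \text{ is fresh and } \nexists a''\,(\{\mathbf{b}\}_c = a'' \in T) \\[4pt]
(case)  & (\nu\{\mathbf{a}\})\,(\mathsf{case}\ \{\mathbf{b}\}_c = a\ \mathsf{in}\ p \mid q \mid T) \to (\nu\{\mathbf{a}\})\,([\mathbf{b}'/\mathbf{b}]p \mid q \mid T) \\
        & \text{if } \{\mathbf{b}'\}_c = a \in T
\end{array}
\]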
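The effect of the three cryptographic rules on the table can be sketched as follows. This is a hedged illustration, not the authors' implementation: the table is assumed to be a plain dictionary from (plaintext tuple, key) to ciphertext, the function names are invented here, and the `c0, c1, ...` naming of fresh ciphertexts is a hypothetical scheme that assumes no other name has that shape.

```python
import itertools

_fresh = itertools.count()


def let_encrypt(plaintext, key, table, restricted):
    """Encryption step: return the name standing for {plaintext}_key."""
    entry = (tuple(plaintext), key)
    if entry in table:                 # first let rule: reuse the ciphertext
        return table[entry]
    cipher = f"c{next(_fresh)}"        # second let rule: pick a fresh name
    table[entry] = cipher
    restricted.add(cipher)             # the new ciphertext starts out restricted
    return cipher


def case_decrypt(cipher, key, arity, table):
    """Decryption step: plaintext tuple, or None when no transition exists."""
    for (plaintext, k), a in table.items():
        if a == cipher and k == key and len(plaintext) == arity:
            return plaintext
    return None                        # wrong key or wrong length: deadlock
```

Returning `None` models the deadlock described in the text: a decryption with the wrong key or the wrong tuple length simply has no transition.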



Finally, the last three rules $(let_1^e)$, $(let_2^e)$, and $(case^e)$ describe the
encoding/synthesis and decoding/analysis performed by the environment: in the rule
$(let_1^e)$ the environment learns a private name by encoding, in the rule $(let_2^e)$ the
environment creates a new ciphertext, and in the rule $(case^e)$ the environment
learns new names by decoding.
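\[
\begin{array}{ll}
(let_1^e) & (\nu\{\mathbf{a}, a\})\,(p \mid T \cup \{\{\mathbf{b}\}_c = a\}) \to (\nu\{\mathbf{a}\})\,(p \mid T \cup \{\{\mathbf{b}\}_c = a\}) \\
          & \text{if } \{\mathbf{b}, c\} \cap \{\mathbf{a}, a\} = \emptyset \text{ and } a \notin \{\mathbf{a}\} \\[4pt]
(let_2^e) & (\nu\{\mathbf{a}\})\,(p \mid T) \to (\nu\{\mathbf{a}\})\,(p \mid T \cup \{\{\mathbf{b}\}_c = a\}) \\
          & \text{if } \{\mathbf{b}, c\} \cap \{\mathbf{a}\} = \emptyset,\ a \text{ is fresh, and } \nexists a'\,(\{\mathbf{b}\}_c = a' \in T) \\[4pt]
(case^e)  & (\nu\{\mathbf{a}\})\,(p \mid T \cup \{\{\mathbf{b}\}_c = a\}) \to (\nu\{\mathbf{a}\} \setminus \{\mathbf{b}\})\,(p \mid T \cup \{\{\mathbf{b}\}_c = a\}) \\
          & \text{if } \{a, c\} \cap \{\mathbf{a}\} = \emptyset \text{ and } \{\mathbf{b}\} \cap \{\mathbf{a}\} \neq \emptyset
\end{array}
\]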

We remark that (i) a very general treatment of the capabilities of the environment
is obtained from the rules $(out)$, $(in)$ and the last three rules and (ii)
the attacker knows all the names not explicitly restricted by a $\nu$-operator. We
write $r \to_R r'$ to make explicit the reduction rule $R$ being applied. We note
that encoding a new tuple (plaintext, key) generates a new ciphertext. The side
conditions on the only rules that modify the table, $(let_2)$ and $(let_2^e)$, ensure
that the acyclicity and injective function properties of the cryptographic table
are preserved by reduction.
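Since the attacker knows exactly the unrestricted names, the two environment rules that shrink the restricted set can be read as a saturation procedure. The sketch below is an illustration under assumptions: the table is a dictionary from (plaintext tuple, key) to ciphertext, the function name `saturate` is hypothetical, and the rule that creates new ciphertexts is deliberately left out because it never changes the restricted set.

```python
def saturate(table, restricted):
    """Repeat the two name-learning environment rules until nothing changes.

    Returns the set of names that remain restricted (unknown to the
    environment) once it has encoded and decoded everything it can.
    """
    restricted = set(restricted)
    changed = True
    while changed:
        changed = False
        for (plaintext, key), cipher in table.items():
            hidden = set(plaintext) & restricted
            # Learning by encoding: plaintext and key known, ciphertext private.
            if key not in restricted and not hidden and cipher in restricted:
                restricted.discard(cipher)
                changed = True
            # Learning by decoding: ciphertext and key known, some plaintext private.
            if cipher not in restricted and key not in restricted and hidden:
                restricted -= hidden
                changed = True
    return restricted


# The key k leaks through {k}_g = b, after which {s}_k = a can be opened.
leaky = {(("s",), "k"): "a", (("k",), "g"): "b"}
print(saturate(leaky, {"s", "k"}))  # set()
```

A secret stays restricted under this closure exactly when no chain of known keys opens a ciphertext containing it.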

Lemma 1. The set of configurations is closed under reduction. 

Definition 6 (error). A configuration with error is a configuration having the
shape: $(\nu\{\mathbf{a}\})\,(err \mid p \mid T)$.

We write $r \downarrow err$ if $r$ is a configuration with error (read $r$ commits on $err$)
and $r \downarrow_* err$ if $r \to^* r'$ and $r' \downarrow err$. We note that configurations with errors are
closed under reduction. In the sequel we will use the process $err$ to flag various
undesirable outcomes such as leakage of secrets or improper authentication, and
so we will be interested in deciding whether $r \downarrow_* err$.

We briefly comment on the relationship with the π-calculus. The main differences
are: (i) We let all communications go through a unique (unnamed) channel
that connects the principals to the environment. (ii) We add cryptographic primitives,
which affect the contents of the cryptographic table. In principle, we can
code this process calculus in a variety of π-calculus. This amounts to: (i) Decorating
all input-output actions with fresh global channels, thus replacing $!b.p$
with $\bar{a}b.p$ and likewise $?b.p$ with $a(b).p$, where $a$ is a fresh name. (ii) Representing
the cryptographic table as a process that receives messages on a global channel, 
say c. The coding and decoding operations are represented as remote procedure 
calls from the principals and the environment to the cryptographic process. To 
make sure that messages are not intercepted, we assume that the cryptographic 
process is the unique receiver on the channel c. We refrain from going into this 
development because it seems much more effective, both in the mathematical 
development and in the practical applications, to expose directly the structure 
of the cryptographic table. 

Finally, we remark that the reduction rules can be easily turned into a la- 
belled transition system whose actions (internal reduction, input, free and bound 
output) are inherited from the π-calculus. We can then rely on the π-calculus
notion of bisimulation to reason about the equivalence of configurations. Thus 
our approach does not preclude the expression of security properties based on 
process equivalence [1]. 
Some syntactic sugar. We now describe how several concepts may be encoded
in the core calculus given above. We first describe several abbreviations that 
improve readability. We sometimes use multiple abbreviations if the order of 
expanding them out is apparent from the context. 

Sending or receiving a tuple of names can be encoded in the calculus with 
monadic communication by considering tuples as ciphertexts encoded with a 
distinguished globally-known key g: 

$!(\mathbf{b}).p = \mathsf{let}\ a = \{\mathbf{b}\}_g\ \mathsf{in}\ !a.p \qquad ?(\mathbf{b}).p = ?a.\mathsf{case}\ \{\mathbf{b}\}_g = a\ \mathsf{in}\ p\ .$

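A ciphertext $\{\mathbf{b}\}_c$ may appear as a component of the output tuple, or as a value
in a match, in an encoding or as a parameter. In all these cases it is intended
that $\{\mathbf{b}\}_c$ stands for the name $a$ resulting from the encoding let $a = \{\mathbf{b}\}_c$ in $\ldots$
For instance:

\[
\begin{array}{rcl}
!(b', \{\mathbf{b}\}_c, \ldots).p & = & \mathsf{let}\ b = \{\mathbf{b}\}_c\ \mathsf{in}\ !(b', b, \ldots).p \\
{[a = \{\mathbf{b}\}_c]}p, q & = & \mathsf{let}\ b = \{\mathbf{b}\}_c\ \mathsf{in}\ [a = b]p, q \\
\mathsf{let}\ b = \{b', \{\mathbf{b}\}_c, \ldots\}_{c'}\ \mathsf{in}\ p & = & \mathsf{let}\ a = \{\mathbf{b}\}_c\ \mathsf{in}\ \mathsf{let}\ b = \{b', a, \ldots\}_{c'}\ \mathsf{in}\ p \\
A(a', \{\mathbf{b}\}_c, \ldots) & = & \mathsf{let}\ a = \{\mathbf{b}\}_c\ \mathsf{in}\ A(a', a, \ldots)\ .
\end{array}
\]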

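In a filtered input, we check that the input has a component that is equal to a
certain value (marked as $\underline{b}$) or that has a certain shape, e.g. $\{\mathbf{b}\}_c$, and we stop
otherwise.

\[
\begin{array}{rcl}
?(b', \underline{b}, \ldots).p & = & ?(b', c, \ldots).[c = b]p, 0 \\
?(b', \{\mathbf{b}\}_c, \ldots).p & = & ?(b', b, \ldots).\mathsf{case}\ \{\mathbf{b}\}_c = b\ \mathsf{in}\ p\ .
\end{array}
\]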

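As in the filtered input, we check that the decryption of the ciphertext yields a
certain component.

\[
\begin{array}{rcl}
\mathsf{case}\ \{b', \underline{b}, \ldots\}_c = a\ \mathsf{in}\ p & = & \mathsf{case}\ \{b', x, \ldots\}_c = a\ \mathsf{in}\ [x = b]p, 0 \\
\mathsf{case}\ \{b', \{\mathbf{c}\}_d, \ldots\}_{c'} = a'\ \mathsf{in}\ p & = & \mathsf{case}\ \{b', a, \ldots\}_{c'} = a'\ \mathsf{in}\ \mathsf{case}\ \{\mathbf{c}\}_d = a\ \mathsf{in}\ p\ .
\end{array}
\]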



Security property annotations. Next we describe annotations with which we 
decorate protocols when analysing their secrecy and integrity properties. We 
must clarify that these annotations (and the auxiliary processes arising from 
them) are not part of the protocol, and find use only in verification, where we 
analyse the translation (image) of an annotated protocol. 

We mark the generation of a name $a$ intended to remain secret in a protocol
configuration with the annotation $(\nu a)^{sec}\,p$. We can easily program an observer
$W(a)$ such that if the environment ever discovers the name $a$, then an error
configuration is reachable:

$W(a) = ?a'.[a = a']err, 0$ (secrecy observer). (1)

Secrecy annotations are then translated as follows:

$(\nu a)^{sec}\,p = (\nu a)\,(p \mid W(a))$ (secrecy annotation). (2)

In order to program an observer for authenticity properties we need a limited
form of private channel. Fortunately, a private key $a$ already allows the encoding
of a private use-at-most-once channel $a$ as follows:

$\bar{a}b.p = !\{b\}_a.p \qquad a(b).p = ?\{b\}_a.p\ .$ (3)
We note that this encoding does not work for arbitrary (multiple-use) channels,
since the environment may replay messages, and without mechanisms like
time-stamping, it would not be possible to differentiate replayed messages from 
genuine fresh messages. However, the encoding can be generalised to a bounded 
use channel, if each message instance contains distinguished components that 
uniquely index each distinct use of the channel. 

To specify authenticity properties we mark certain send and receive actions in
the principals with authenticity annotations $auth_o(\cdot)$, $auth_i(\cdot)$ which represent,
respectively, a sender $x$ registering a message $m$ with a ‘judge’ process along a
private channel $j$ (in the sense of (3)) prior to sending that message to $y$, and
the receiver $y$ claiming authenticity of a received message (purportedly sent by
$x$, which it has presumably authenticated). Note that the ‘judge’ $J_{x,y}$ rules on
the authenticity of a single message allegedly sent by $x$ to $y$.

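\[
\begin{array}{rcl}
auth_o((x,y,m)).p & = & (\nu n)\ \bar{j}(o, (x,y,m), n).\,j'(\underline{n}).p \\
auth_i((x,y,m)).p & = & \bar{j}(i, (x,y,m), \_).p \\
J_{x,y} & = & j(d, (\underline{x}, \underline{y}, m), n).[d = o](\bar{j'}n.J'_{x,y}(m)), ([d = i](err, 0)) \\
J'_{x,y}(m) & = & j(d, (\underline{x}, \underline{y}, m'), \_).[d = i]([m' = m]0, err), J'_{x,y}(m)\ .
\end{array}
\]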

Here we assume two distinguished names o and i, which indicate whether the 
authenticity of a message is being registered or claimed. Communication with 
the judge is over restricted channels to disallow the environment from mak- 
ing bogus assertions of authenticity. Our encodings will ensure the channel j 
appears in the principals to which x, y are instanced, and j' only in the principal 
corresponding to x. Note that although j is used twice, the different uses are 
distinguishable by the names o and i, and so replays can be recognised. 

Relying on these encodings we translate authenticity annotations as follows: 

$!(\mathbf{b})^{auth}.p = auth_o(\mathbf{b}).!(\mathbf{b}).p \qquad ?(\mathbf{b})^{auth}.p = ?(\mathbf{b}).auth_i(\mathbf{b}).p\ .$

We observe that in an authenticated output, the principal receives an acknowl- 
edgement from the judge process on channel j' before actually outputting the 
message. Since we assume processes do not fail, our encoding can rely on the 
assumption that if a principal registers a message with a judge, it will send that 
message. 

3 Modelling Cryptographic Protocols 

We now illustrate how a protocol specified informally can be transformed into an 
annotated process in the enriched notation of §2. A verification of the protocol 
is beyond the scope of this paper. We consider a symmetric key protocol due to 
Yahalom. This protocol concerns principals a, b, and a trusted server c running
in a possibly hostile environment that can intercept all communications. Initially, 
principal a and principal b each share a symmetric secret key with c. At the end 
of the protocol, principals a, b (and c) share a third symmetric secret key, a 
‘session key’, which can be used by principals a and b to exchange information. 
A run of the protocol is informally described by the following list of events: 
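\[
\begin{array}{lll}
(1) & a \to b : & a, b, n_a \\
(2) & b \to c : & b, \{a, n_a, n_b\}_{k_{bc}} \\
(3) & c \to a : & a, \{b, k_{ab}, n_a, n_b\}_{k_{ac}}, \{a, k_{ab}\}_{k_{bc}} \\
(4) & a \to b : & b, \{a, k_{ab}\}_{k_{bc}}, \{n_b\}_{k_{ab}} \\
(5) & a \to b : & a, b, \{d\}_{k_{ab}} \\
(5') & b \to a : & b, a, \{d\}_{k_{ab}}\ .
\end{array}
\]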



Principal a sends a clear-text message containing a nonce challenge to b. Instead
of responding directly to a, b generates a new nonce $n_b$, its response to a’s
challenge, which, added to components of the original message, is sent encrypted
to c. Server c creates a secret key $k_{ab}$, which is placed in two separately encrypted
message pieces that are sent to a: the first part, readable by a, contains a’s
original challenge, b’s retort $n_b$ and the shared secret $k_{ab}$. The other part, not
readable by a but by b, contains the same shared secret and a’s identity; it is
forwarded by a in event 4 to b, together with b’s challenge encrypted with this new
shared secret. We have explicitly mentioned the intended recipients in messages
1, 3, and 4. We have also shown (two possibilities of) the first post-protocol
message 5 (5') in which datum d is sent encrypted using the new session key $k_{ab}$.

The secrecy property we would like to verify is that the keys $k_{ac}$, $k_{bc}$, and $k_{ab}$
remain secret. The authenticity property that we would like to verify is that the
message successfully received by b at the second part of event 5 (alternatively
by a in 5') of the protocol is the same as the one sent by a (respectively, by b in
5'). An informal justification of the protocol is that following event 3, principal
a will believe it is interacting with principal b if the third element of the message
component encrypted with $k_{ac}$ is the same as the nonce $n_a$ that it had generated
in event 1. Following event 4, principal b will believe it has authenticated and
established a secure channel with principal a, provided the nonce $n_b$ encrypted
with the received key $k_{ab}$ is equal to the nonce generated at event 2.
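To make the message flow concrete, an honest run of events 1 to 4 can be replayed symbolically, with every ciphertext represented by a name recorded in a table. This is only a sketch under assumptions: the helper names `enc`, `dec` and `honest_run`, the string names for nonces and keys, and the `c0, c1, ...` ciphertext names are all hypothetical, and no attacker is modelled.

```python
import itertools

table = {}                      # (plaintext tuple, key) -> ciphertext name
_ids = itertools.count()


def enc(plaintext, key):
    """Name standing for {plaintext}_key, created on first use."""
    entry = (tuple(plaintext), key)
    if entry not in table:
        table[entry] = f"c{next(_ids)}"
    return table[entry]


def dec(cipher, key):
    """Plaintext of cipher under key, or None if key does not open it."""
    for (plaintext, k), a in table.items():
        if a == cipher and k == key:
            return plaintext
    return None


def honest_run(kac="kac", kbc="kbc"):
    na, nb, kab = "na", "nb", "kab"          # nonces and session key
    msg1 = ("a", "b", na)                                     # (1) a -> b
    msg2 = ("b", enc((msg1[0], msg1[2], nb), kbc))            # (2) b -> c
    x, n_a, n_b = dec(msg2[1], kbc)          # c opens b's request
    msg3 = ("a", enc(("b", kab, n_a, n_b), kac),
            enc((x, kab), kbc))                               # (3) c -> a
    peer, k, n, m = dec(msg3[1], kac)        # a opens its part
    assert peer == "b" and n == na           # a checks its own challenge
    msg4 = ("b", msg3[2], enc((m,), k))                       # (4) a -> b
    who, k_b = dec(msg4[1], kbc)             # b opens the forwarded part
    assert who == "a" and dec(msg4[2], k_b) == (nb,)  # b checks its nonce
    return k, k_b                            # session key as seen by a and b
```

Calling `honest_run()` returns the session key as seen by a and by b; in an honest run both components coincide.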

We will model a system $q(a,b,c)$ consisting of processes $p_a$, $p_b$, and $p_c$ representing
principals a, b and c, which are assumed to follow the protocol honestly.
Verifying that a session between a and b cannot be attacked means showing that
the system $q(a,b,c)$ can never evolve into a configuration with error.

From the message sequence chart above we can extract the sequence of events 
where a principal acts as a sender or a receiver, as well as the name generation, 
cryptographic and matching operations that it performs. The process $p_a$, for the
first alternative of the first post-protocol event, is then specified as follows:

\[
\begin{array}{ll}
(1) & (\nu n_a)\ !(a, b, n_a). \\
(3) & ?(\underline{a}, \{\underline{b}, k, \underline{n_a}, n\}_{k_{ac}}, y). \\
(4) & !(b, y, \{n\}_k). \\
(5) & !(a, b, \{d\}_k)^{auth}.0\ .
\end{array}
\]

The behaviours $p_b$ and $p_c$ are defined in a similar way. The process we consider
for analysis is:

$q(a,b,c) = (\nu k_{ac})^{sec}\,(\nu k_{bc})^{sec}\,(\nu j)\,(\nu j')\,(p_a \mid p_b \mid p_c \mid J_{a,b} \mid \emptyset)$

where $\emptyset$ is the empty cryptographic table. Keys $k_{ac}$ and $k_{bc}$ being long-term
secrets, we restrict these names at the top-level. To indicate that the authenticity
judge is observing a session involving a and b, we place a judge process $J_{a,b}$ in
parallel with the principals and restrict the names $j, j'$. What we have presented
is only illustrative: we also have to consider a similar process with event 5' instead
of event 5, and with the judge $J_{b,a}$. In general, to consider the protocol operating
in more complicated scenarios, we can emend $q$ by enriching the contexts in which
the principals are placed. For instance, allowing the environment to attempt 
replay attacks using messages from other sessions may be achieved by starting 
with a non-empty table. 

4 Reachability 

In the previous sections, we expressed secrecy and authenticity properties as 
reachability properties. We now present some results on the problem of deter- 
mining whether a configuration can reach one with error. As usual, a name 
substitution $\sigma$ is a function on names that is the identity almost everywhere.

Definition 7 (injective renaming). Let $r$ and $r'$ be configurations. We write
$r \simeq r'$ if there is an injective substitution $\sigma$ such that $\sigma r \equiv r'$.

We study reachability modulo injective renaming. 

Lemma 2. (1) The relation $\simeq$ is reflexive, symmetric, and transitive.

(2) If $r \simeq r'$ then $r \downarrow err$ iff $r' \downarrow err$.

(3) If $r \simeq r'$ and $r \to_R r_1$, then $\exists r_1'\,(r' \to_R r_1'$ and $r_1 \simeq r_1')$.

We consider rewriting modulo injective renaming. 

Lemma 3. Let $r \equiv (\nu\{\mathbf{a}\})\,(p \mid T)$ be a configuration. Then:

(1) The set $\{r' \mid r \to_R r'\}$ where $R$ is not $(let_2^e)$ is finite modulo injective
renaming.

(2) Given a sort $s$, the set of configurations to which $r$ may reduce by $(let_2^e)$,
while introducing a name of sort $s$ in the cryptographic table, is finite modulo
injective renaming.

Therefore, if we can bound the sorts in the cryptographic table then the reduction 
relation defined in section 2 is finitely branching modulo injective renaming. 

A second important result is that all reduction rules except input are strongly 
confluent modulo injective renaming. 

Lemma 4. Let $r$ be a configuration and suppose $r \to_{R_1} r_1$ and $r \to_{R_2} r_2$ where
$R_i$ is not the input rule. Then

$\exists r_1', r_2'\ (r_1 \to^{0,1}_{R_1'} r_1',\ r_2 \to^{0,1}_{R_2'} r_2',$ and $r_1' \simeq r_2')$

where $R_i'$ is not the input rule and $\to^{0,1}$ indicates reduction in 0 or 1 steps.

The rule $(let_2^e)$ can be postponed except in certain particular cases.
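Lemma 5. Suppose $r$ is a configuration and

$r \equiv (\nu\mathbf{a})\,(p \mid T) \to_{let_2^e} r' \equiv (\nu\mathbf{a})\,(p \mid T \cup \{\{\mathbf{b}\}_c = a'\}) \to_R r''$

where $\{\mathbf{b}\}_c = a'$ is the tuple introduced by the first reduction. Then in the
following cases the first reduction $(let_2^e)$ can be postponed or eliminated:

(1) If $R = (in)$ and the name taken in input is not $a'$ then

$\exists r_1, r_2\ (r \to_{in} r_1 \to_{let_2^e} r_2)$ and $r'' \simeq r_2$.

(2) If $R = (let_2^e)$ and $r'' \equiv (\nu\mathbf{a})\,(p \mid T \cup \{\{\mathbf{b}\}_c = a', \{\mathbf{b}_1\}_{c_1} = a_1'\})$ where
$\{\mathbf{b}_1\}_{c_1} = a_1'$ is the tuple introduced by the second reduction and this tuple does
not depend on $a'$, i.e., $a' \notin \{\mathbf{b}_1, c_1\}$. Then the two $(let_2^e)$ reductions can be
permuted:

$r \to_{let_2^e} (\nu\mathbf{a})\,(p \mid T \cup \{\{\mathbf{b}_1\}_{c_1} = a_1'\}) \to_{let_2^e} r''$.

(3) If $R = (let_1)$ and we have

$r' \equiv (\nu\mathbf{a})\,(p'' \mid \mathsf{let}\ a = \{\mathbf{b}\}_c\ \mathsf{in}\ p' \mid T \cup \{\{\mathbf{b}\}_c = a'\}) \to_{let_1}$
$r'' \equiv (\nu\mathbf{a})\,(p'' \mid [a'/a]p' \mid T \cup \{\{\mathbf{b}\}_c = a'\})$.

Then the $(let_2^e)$ reduction can be eliminated as follows:

$r \to_{let_2} (\nu\mathbf{a}, a')\,([a'/a]p' \mid p'' \mid T \cup \{\{\mathbf{b}\}_c = a'\}) \to_{let_1^e} r''$.

(4) For all cases of $R$ other than $(in)$, $(let_2^e)$ or $(let_1)$: $\exists r_1, r_2\ (r \to_R r_1 \to_{let_2^e} r_2)$
and $r'' \simeq r_2$.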

In the remaining cases the name $a'$ affects the following rule, so that the $(let_2^e)$
reduction cannot be postponed. Next, we introduce a measure $d(s)$ of a sort
$s$ which will provide an upper bound on the number of $(let_2^e)$ reductions that
might be needed for the synthesis of a name.

Definition 8 (sort measure). We define a measure $d$ on sorts as follows:
$d(0) = 0 \qquad d(s_1, \ldots, s_n) = 1 + d(s_1) + \ldots + d(s_n)\ .$
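The measure of Definition 8 translates directly into a recursive function. In this small Python sketch sorts are assumed to be represented as the integer 0 for the ground sort and nested tuples for compound sorts; that encoding is our own choice for illustration.

```python
def d(sort):
    """d(0) = 0 and d((s1, ..., sn)) = 1 + d(s1) + ... + d(sn)."""
    if sort == 0:
        return 0
    return 1 + sum(d(s) for s in sort)


# The sort (0, (0, 0)) needs two encodings: one for the key, one for the message.
print(d((0, (0, 0))))  # 2
```

In other words, $d(s)$ counts the encryption steps needed to build a name of sort $s$ from ground names.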

Lemma 6. Suppose $r_1 \to_{let_2^e} \cdots \to_{let_2^e} r_{n+1} \to_{in} r_{n+2}$ where the name received
in input in the last reduction has sort $s$ and $n > d(s)$. Then at least $n - d(s)$
$(let_2^e)$ reductions can be postponed modulo injective renaming, i.e.

$r_1 \equiv r_1' \to_{let_2^e} \cdots \to_{let_2^e} r'_{d(s)+1} \to_{in} r'_{d(s)+2} \to_{let_2^e} \cdots \to_{let_2^e} r'_{n+2}$

and $r'_{n+2} \simeq r_{n+2}$.

Proof hint. The construction of a name $a$ of sort $s$ taken in input needs at most
$d(s)$ $(let_2^e)$ reductions. All the other $(let_2^e)$ reductions can be moved to the right
of the input by iterated application of lemma 5(1-2). □

To summarise, $(let_2^e)$ reductions can be postponed except when they are
needed in the construction of a name to be input. In this case, the number of
needed $(let_2^e)$ reductions is bounded by $d(s)$ if $s$ is the sort of the input name.




The Game of the Name in Cryptographic Tables 






Next, let us concentrate on the reachability problem in the case where all 
principals are finite processes (in practical applications this is often the case). We 
note that the secrecy and authenticity annotations compile to finite processes. 

Definition 9 (configuration measure). We define the measure of a configuration
$r \equiv (\nu\{\mathbf{a}\})\,(p \mid T)$ as the pair $(|p|, |\mathbf{a}|)$ where $|p|$ is the size of the process
and $|\mathbf{a}|$ is the cardinality of $\{\mathbf{a}\}$.

Lemma 7. (1) Rules $(out)$, $(in)$, $(\nu)$, $(m)$, $(let_1)$, $(let_2)$, and $(case)$ decrease
the size of the process $|p|$.

(2) Rules $(let_1^e)$ and $(case^e)$ decrease the size of the restricted names $|\mathbf{a}|$ while
leaving unchanged the size of the processes.

(3) Rule $(let_2^e)$ leaves the measure $(|p|, |\mathbf{a}|)$ unchanged.

(4) If $r \to_{let_2^e} r' \downarrow err$ then $r \downarrow err$.

In the following, we concentrate on the issue of deciding reachability of a con- 
figuration with error assuming that for every input we can compute a finite and 
complete set of sorts which is defined as follows. 

Definition 10 (complete set of sorts). Given a configuration $r \equiv (\nu\mathbf{a})\,(?a.p \mid q \mid T)$
we say that a set of sorts $S$ is complete for the input $?a.p$ if whenever there
is a reduction sequence starting from $r$ leading to $err$ and whose first reduction
is performed by $?a.p$, there is a reduction sequence leading to $err$ where the name
taken in input has a sort in $S$.

The problem of determining tight bounds on a complete set of sorts is not
trivial. Consider $p = (\nu k)\,(?a.!\{a\}_k \mid ?\{\{c\}_{k'}\}_k.err)$. It is easily checked that the
set $\{0\}$ is not complete for the input $?a.!\{a\}_k$, but the set $\{(0, 0)\}$ is. Nevertheless,
if a finite complete set of sorts can be computed then the following strategy
decides if $r$ can reach $err$.

1. Perform in an arbitrary order the reductions different from $(let_2^e)$ and $(in)$.

2. Analyse the current configuration:

(a) $err$ has been reached: stop and report that $err$ is reachable.

(b) $err$ has not been reached and no $(in)$ reductions are possible: backtrack if
possible, otherwise report that $err$ is not reachable and stop.

(c) Otherwise:

i. Non-deterministically select an input action and compute a finite complete
set of sorts for it, say $\{s_1, \ldots, s_n\}$.

ii. Non-deterministically perform a sequence of $(let_2^e)$ reductions of length at
most $\max\{d(s_1), \ldots, d(s_n)\}$.

iii. Non-deterministically select an input for the input action. Go to step 1.



Theorem 1. Starting from a configuration $r$ the strategy above terminates and
it will report an error iff $r \downarrow_* err$.
Proof hint. To show termination, note that every loop from step 1 to step
2(c)(iii) decreases the well-founded measure of definition 9.

($\Rightarrow$) The strategy examines a subset of the reachable configurations and therefore
it is obviously sound.

($\Leftarrow$) Let $r$ be the initial configuration. The rewriting in step 1 terminates in
a configuration $r'$ by lemma 7. By iterated application of lemma 4 and lemma
2, if $r \downarrow_* err$ then $r' \downarrow_* err$. In step 2(b), if we have not reached $err$ then we
can safely claim by lemma 7(4) that $err$ is not reachable. In step 2(c)(ii), we
know, by lemma 6, that if there is a sequence that leads to error then there is a
sequence that leads to error whose initial sequence of $(let_2^e)$ is bounded by the
sorts’ measure. By lemma 3(2), there is a finite number of sequences of $(let_2^e)$
reductions of a given length (modulo injective renaming). Thus there are a finite
number of cases to consider. In step 2(c)(iii), we apply again lemma 3(1) to
conclude that there are a finite number of cases to consider modulo injective
renaming. □

Future work. We need to explore whether there are techniques to determine
bounds on the sorts of input names. Even assuming bounds on sorts of inputs, 
we need to study the practical feasibility of our technique and its complexity. 
Abstract. We deal with the maximum cut problem on cubic graphs
and we present a simple $O(\log n)$ time parallel algorithm, running on a
CRCW PRAM with $O(n)$ processors. The approximation ratio of our
algorithm is 1.3 and improves the best known parallel approximation
ratio, i.e. 2, in the special case of cubic graphs.



Keywords: Cubic graphs, Max Cut, NP-completeness, PRAM Model. 

1 Introduction and Notation 

In this paper we deal with the maximum cut problem, which can be formally stated
as follows:

INSTANCE: An undirected $n$-vertex graph $G(V, E)$.

SOLUTION: A partition of $V$ into two disjoint sets $V_r$ (right side class) and $V_l$
(left side class).

MEASURE: The number of edges with one endpoint in $V_l$ and one endpoint in
$V_r$, i.e. the cardinality of $E_s = \{(u, v)$ such that either $u \in V_l$ and $v \in V_r$ or
$u \in V_r$ and $v \in V_l\}$.
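The measure can be computed directly from an edge list. The following Python sketch is only an illustration: the function name `cut_size` and the representation of the partition by the set of left-side vertices are assumptions made here.

```python
def cut_size(edges, left):
    """Number of edges with exactly one endpoint in `left` (the set V_l)."""
    left = set(left)
    return sum((u in left) != (v in left) for u, v in edges)


# K4 is cubic; splitting its vertices 2/2 cuts 4 of its 6 edges.
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(cut_size(k4, {0, 1}))  # 4
```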

The problem of finding a maximum cut of a given graph is, in general, NP- 
complete [6, 7] and has been deeply studied (see, for example, [3, 4, 5, 11, 12, 
14, 16, 17]). 

In this paper we focus our attention on the special class of cubic graphs
[1, 9]. Even restricted to this class, the maximum cut problem does not become
easier, since it has been proved to be NP-complete if the graph is triangle-free and
is at most cubic [18] and to be APX-complete, even if the degree of $G$ is bounded
by a constant [15]. Here we present new theoretical results characterizing the
cardinality of the cut in cubic graphs with respect to the degree of vertices in
the graph $(V, E_s)$. These results make it possible to design a simple $O(\log n)$
time parallel algorithm, running on a CRCW PRAM with $O(n)$ processors.

The best known approximation ratio for the maximum cut problem in parallel 
is 2 and follows from [13]. Our results improve it in the special case of cubic 
graphs, since our algorithm achieves an approximation ratio 1.3. So, this parallel 
algorithm approaches the best known sequential approximation ratio, that is 
1.138 proved in [8]. It is worthwhile to note that the sequential algorithm for
Max Cut presented in [8] is based on the use of the primal-dual technique and
its parallelization seems not to be easy. 

The remainder of this paper is organized as follows. Section 2 considers the 
problem from a theoretical point of view, while Section 3 addresses the design 
of the parallel approximation algorithm. Conclusions and open problems are 
presented in Section 4. 

In the sequel, we denote by B the bipartite graph $B(V_l \cup V_r, E_s)$ and by 
$E_d$ the set $E - E_s$, i.e. $E_d = \{(u, v)$ s.t. either $u, v \in V_l$ or $u, v \in V_r\}$. From now 
on we call edges in $E_s$ and $E_d$ solid and dotted edges, respectively. 

We partition the vertices of $V_l$ ($V_r$) according to their solid degree, that is, 
their degree in B. In particular, $V_l = L_0 \cup L_1 \cup L_2 \cup L_3$ and $V_r = R_0 \cup R_1 \cup R_2 \cup R_3$, 
where $L_i$ and $R_i$ are the sets of vertices of solid degree i = 0, 1, 2, 3 in $V_l$ and $V_r$, 
respectively. We also denote by $l_i$ ($r_i$) the cardinality of $L_i$ ($R_i$). 
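
The notation can be made concrete with a small example. The following is a minimal Python sketch, not part of the original paper: the function name `solid_degree_classes` and the adjacency-list representation are assumptions chosen for illustration. It computes the classes $L_i$ and $R_i$ for a given bipartition of the complete graph on four vertices, the smallest cubic graph.

```python
from collections import defaultdict

def solid_degree_classes(adj, left):
    """Group vertices by side and solid degree.

    adj  : dict mapping each vertex to the list of its neighbours
    left : set of vertices placed in the left class; all others are on the right
    Returns {"L": {i: set}, "R": {i: set}} where i is the solid degree.
    """
    classes = {"L": defaultdict(set), "R": defaultdict(set)}
    for v, nbrs in adj.items():
        v_left = v in left
        # An edge is solid when its endpoints lie on different sides.
        solid = sum(1 for u in nbrs if (u in left) != v_left)
        classes["L" if v_left else "R"][solid].add(v)
    return classes

# K4: every vertex has degree 3.
K4 = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2]}
balanced = solid_degree_classes(K4, left={0, 1})
single = solid_degree_classes(K4, left={0})
```

With `left = {0}` the left class consists of one vertex of solid degree 3 and the right class of three vertices of solid degree 1, so the counting of solid edges from either side gives the same value, 3.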

2 Some Theoretical Results 

The aim of this section is to give sufficient conditions on $V_l$ and $V_r$ in order to 
guarantee the approximation ratio obtained in this work. 

In the following we state some results referring to $V_l$ but, by symmetry, 
it is always possible to exchange the roles of $V_l$ and $V_r$. 

First of all, observe that it is always possible to modify the partition so that 
no vertex has solid degree 0; indeed, if such a vertex exists, it is enough to 
move it from its class to the other one. Therefore, it is not restrictive to suppose 
$l_0 = r_0 = 0$. 

Moreover, under the condition $l_1 = l_2 = 0$, we can trivially deduce the 
following equations, which will be useful for the rest of the section: 

1. $3 l_3 = 3 r_3 + 2 r_2 + r_1$, derived by counting the number of solid edges as the sum 
of the solid degrees of the vertices in $V_l$ and in $V_r$. 

2. $n = l_3 + r_1 + r_2 + r_3$, derived by counting the total number of vertices. 



Lemma 1. If $V_l$ contains only vertices of solid degree 3, i.e. $V_l = L_3$, then 

$$\frac{n}{4} \le l_3 \le \frac{n}{2}.$$

Proof. The condition is easily deduced from Equations 1 and 2. In particular, in 
order to obtain the lower bound we isolate $r_1$ from Equation 2 and we substitute 
it in Equation 1. For the upper bound, we isolate $r_3$ from Equation 
2 and we substitute it in Equation 1. 

Note that the lower and upper bounds on $l_3$ are attained when $r_2 = r_3 = 0$ and 
$r_1 = r_2 = 0$, respectively. 
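
The two substitutions used in the proof of Lemma 1 can be written out explicitly. The following is a worked sketch, assuming that Equation 1 reads $3 l_3 = 3 r_3 + 2 r_2 + r_1$ and Equation 2 reads $n = l_3 + r_1 + r_2 + r_3$, as their verbal descriptions indicate.

```latex
\begin{align*}
\text{lower bound:}\quad
  r_1 &= n - l_3 - r_2 - r_3 \\
  3 l_3 &= 3 r_3 + 2 r_2 + (n - l_3 - r_2 - r_3) \\
  4 l_3 &= n + r_2 + 2 r_3 \;\ge\; n \\[4pt]
\text{upper bound:}\quad
  r_3 &= n - l_3 - r_1 - r_2 \\
  3 l_3 &= 3 (n - l_3 - r_1 - r_2) + 2 r_2 + r_1 \\
  6 l_3 &= 3 n - 2 r_1 - r_2 \;\le\; 3 n
\end{align*}
```

Under these assumptions $l_3 \ge n/4$, with equality exactly when $r_2 = r_3 = 0$, and $l_3 \le n/2$, with equality exactly when $r_1 = r_2 = 0$.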



Lemma 2. If $V_l$ contains only vertices of solid degree 3, i.e. $V_l = L_3$, then 
$r_1 \ge 2n - 5 l_3$. 

Proof. From Equation 1 and Equation 2 it follows that $r_2 + r_3 = \frac{3 l_3 + r_2 - r_1}{3}$ and 
that $r_1 = n - l_3 - (r_2 + r_3)$, respectively. 

By substituting the first equality in the second one, simple calculations lead 
to $r_1 = \frac{3}{2} n - 3 l_3 - \frac{r_2}{2}$. In this latter equality we substitute $r_2 \le n - l_3 - r_1$, 
obtained from Equation 2 and from the fact that $r_3 \ge 0$. 

The inequality in the statement follows immediately. 

Let $c$ be the maximum number of pairwise vertex-disjoint odd length cycles 
in G and let $c_r$ be the number of odd length cycles in the subgraph of G induced 
by $V_r$. Note that, due to the structure of the partition of vertices, all cycles in $V_r$ 
are vertex-disjoint. Trivially, $c \ge c_r$. Moreover, recall that the odd girth 
$g$ of a graph G is defined as the length of a shortest odd cycle (if any) in G. 

Lemma 3. For any partition $(V_l, V_r)$ of the vertices of G it holds: $0 \le c_r \le \frac{r_1}{g}$. 

Proof. It is easy to see that only vertices of $R_1$ can be involved in cycles in 
the subgraph induced by $V_r$. Moreover, this subgraph is a collection of isolated 
vertices, simple paths and cycles. The upper bound for $c_r$ is attained when all the 
connected components of the subgraph induced by $V_r$ are odd length cycles, 
each having length $g$. 



3 A Parallel Algorithm for Finding a “Large” Cut 

In this section we present a parallel algorithm for finding a “large” cut of a cubic 
graph. 

The idea behind our algorithm is to find any bipartition of the vertices and 
to increase the number of edges in the cut by appropriately moving vertices from 
one side of the bipartition to the other. In particular, we move to the opposite 
side both vertices of solid degree 0 and vertices of solid degree 1; indeed, as 
shown in Figure 1, each time we transfer a 0-degree vertex it becomes a 3-degree 
vertex and the number of solid edges increases by 3, and each time we transfer 
a 1-degree vertex it becomes a 2-degree vertex and the number of solid edges 
increases by 1. 
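
The effect of a single move can be checked on a small instance. This is a minimal sequential sketch in Python; the helper names `cut_size` and `move` and the choice of the complete graph on four vertices are assumptions made for illustration and do not appear in the paper. In a cubic graph, moving a vertex of solid degree d changes the cut by 3 - 2d, which gives the gains 3 and 1 quoted above.

```python
def cut_size(adj, left):
    """Number of solid edges: edges whose endpoints lie on different sides."""
    return sum(1 for v in adj for u in adj[v]
               if v < u and (v in left) != (u in left))

def move(left, v):
    """Return the left class after moving vertex v to the opposite side."""
    return left ^ {v}

K4 = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2]}

# Start with every vertex on the right: all solid degrees are 0.
left = frozenset()
gain_from_degree_0 = cut_size(K4, move(left, 0)) - cut_size(K4, left)

# Vertex 1 now has exactly one solid edge (the one towards vertex 0).
left = move(left, 0)
gain_from_degree_1 = cut_size(K4, move(left, 1)) - cut_size(K4, left)
```

Moving one vertex at a time, as done here, sidesteps the interference between adjacent vertices that the parallel algorithm has to handle explicitly.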

Before detailing the algorithm for approximating the maximum cut, we pre- 
sent some preliminary procedures for coloring vertices that will be used in the 
following. 







T. Calamoneri et al. 




Lemma 4. Let $G(V, E)$ be a not necessarily connected graph of maximum degree 
2. Then $O(\log n)$ parallel steps, using $O(n)$ processors on an EREW PRAM model, 
are sufficient to find a 3-coloring of G, where the number of vertices having color 
3 is as small as possible (possibly 0). 

Proof. Observe that G{V, E) is a collection of isolated vertices, paths and cycles. 

It is not difficult to recognize all the connected components and to decide 
whether they are vertices, paths or cycles by using the pointer jumping technique 
[10]. 

Then we work on each connected component as follows: 



— If the connected component is an isolated vertex, we give it color 1. 

— If the connected component is a path, we find a 2-coloring using the pointer 
jumping technique. Namely, we start from any vertex $v$, we assign color 1 
to $v$ and color 2 to its neighbors; in the general step $i$ of the pointer jumping 
loop, we assign the color of vertex $j$ to the not yet colored vertices at distance 
$2^i$ from $j$. First observe that after $\lceil \log n \rceil - 1$ iterations each vertex has a 
color, as guaranteed by the pointer jumping technique. The coloring found 
is trivially a 2-coloring and it is valid. Indeed, vertex $v$ assigns its color (that 
is, 1) to all vertices at even distance from it, while the vertices adjacent to $v$ 
assign their color (that is, 2) to all vertices at even distance from them, 
and so at odd distance from $v$. It follows that adjacent vertices cannot 
have the same color. 

— If the connected component is a cycle, we find a 3-coloring such that color 3 
is assigned to at most one vertex. We choose at random an edge $(u, w)$ of the 
cycle and we remove it; what remains is obviously a path and the previous 
procedure can be run. If in the output $u$ and $w$ have the same color, then we 
set $c(w) = 3$. The proof of correctness is very similar to the previous one. 

As for the parallel complexity, the pointer jumping technique 
guarantees $O(\log n)$ time using $O(n)$ processors on an EREW PRAM model. 

We underline that although there exist more efficient algorithms to 
find a 3-coloring of a cycle [2], the previous algorithm guarantees that at most 
one vertex receives color 3. 
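
The coloring rule of Lemma 4 can be illustrated sequentially. The sketch below is written in Python and replaces pointer jumping with a plain walk along each component, so it shows only the coloring rule and not the parallel technique; the function names and the assumption of a simple graph (no loops, no multiple edges) are ours.

```python
def color_max_degree_2(adj):
    """3-color a simple graph of maximum degree 2, using color 3 at most once per cycle."""
    color = {}
    for start in adj:
        if start in color:
            continue
        # Collect the connected component of `start`.
        comp, stack, seen = [], [start], {start}
        while stack:
            v = stack.pop()
            comp.append(v)
            for u in adj[v]:
                if u not in seen:
                    seen.add(u)
                    stack.append(u)
        ends = [v for v in comp if len(adj[v]) < 2]
        is_cycle = not ends
        # Paths (and isolated vertices) are walked from an endpoint,
        # cycles from an arbitrary vertex, i.e. after dropping one edge.
        prev, cur, c, order = None, (comp[0] if is_cycle else ends[0]), 1, []
        while cur is not None:
            color[cur] = c
            order.append(cur)
            c = 3 - c  # alternate between colors 1 and 2
            nxt = [u for u in adj[cur] if u != prev and u not in color]
            prev, cur = cur, (nxt[0] if nxt else None)
        # In an odd cycle the last vertex clashes with the first one.
        if is_cycle and len(order) % 2 == 1:
            color[order[-1]] = 3
    return color

def is_proper(adj, color):
    return all(color[v] != color[u] for v in adj for u in adj[v])

# A 5-cycle, a path on three vertices and an isolated vertex.
G = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0],
     5: [6], 6: [5, 7], 7: [6], 8: []}
coloring = color_max_degree_2(G)
```

On this input only the odd cycle needs color 3, and it needs it for exactly one vertex.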

Before stating the next lemma, we need to introduce the concept of 
independent dominating set of a graph. An independent dominating set of a graph 
$G(V, E)$ is a subset $V' \subseteq V$ such that for any vertex $u \in V - V'$ there is a vertex 
$v \in V'$ for which $(u, v) \in E$, and such that no two vertices in $V'$ are joined by 
an edge in $E$. 

Lemma 5. Let $G(V, E)$ be a graph of maximum degree 3. Then $O(\log n)$ parallel 
steps, using $O(n)$ processors on a CRCW PRAM model, are sufficient to find an 
independent dominating set $S$ for G. 

Proof. Assume w.l.o.g. that G is connected (otherwise we apply the same 
argument for each connected component of G). 

We build an independent dominating set S as follows. First, we find a rooted 
spanning tree T of G and assign to each vertex its level in T. Then, we consider 
the subgraph induced by the vertices at odd level; as it has maximum degree 2, 
we can 3-color it according to the algorithm in the proof of Lemma 4. We put in 
S all vertices colored 1 and we delete from G both them and their neighbors. The 
remaining vertices are a subset of vertices at even level in T and have maximum 
degree 2. We 3-color the graph induced by these vertices and we add to S all 
vertices colored 1. 

We now prove the correctness of the algorithm. 

S is an independent set. First, a 3-coloring of the subgraph induced by all the 
vertices at odd level is found, and color 1 is an independent set $S'$ for it. Then, 
the subgraph induced by all the vertices at even level that are not adjacent to 
any vertex in $S'$ is considered. By 3-coloring its vertices and considering color 
1, we construct an independent set $S''$ for this subgraph. $S = S' \cup S''$ is an 
independent set because, by construction, no vertex in $S''$ is adjacent to any 
vertex in $S'$. 

S is a dominating set. The coloring algorithms ensure that each vertex colored 
2 is adjacent to at least one vertex colored 1, and that each vertex colored 3 is 
adjacent to exactly one vertex colored 1 and one vertex colored 2. Moreover, 
each vertex at even level deleted after the first coloring is adjacent to at least 
one vertex colored 1. This guarantees that each vertex in $V - S$ is adjacent to at 
least one vertex colored 1. 

As for the complexity and the PRAM model, let us consider 
each step separately. Finding a spanning tree requires $O(\log n)$ time using $O(n)$ 
processors on a CRCW PRAM [10]. Rooting the tree and leveling its vertices 
can be done through the Euler Tour technique [10] in $O(\log n)$ time using $O(n)$ 
processors on an EREW PRAM. Finding connected components and coloring 
them requires $O(\log n)$ time with $O(n)$ processors on an EREW PRAM in view 
of Lemma 4. All the other tests and operations are performed in constant time. 
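
For experimentation, an independent dominating set can also be obtained sequentially. The Python sketch below is not the parallel construction of Lemma 5: it swaps in the standard greedy maximal independent set, which is independent by construction and dominating because no further vertex can be added. The function name and the test graph (the 3-dimensional cube, a cubic graph) are assumptions made for illustration.

```python
def greedy_independent_dominating_set(adj):
    """Greedy maximal independent set; every maximal independent set is dominating."""
    chosen, blocked = set(), set()
    for v in sorted(adj):
        if v not in blocked:
            chosen.add(v)
            blocked.add(v)
            blocked.update(adj[v])  # neighbours can no longer be chosen
    return chosen

# The cube graph Q3: vertices 0..7, neighbours differ in exactly one bit.
Q3 = {v: [v ^ 1, v ^ 2, v ^ 4] for v in range(8)}
S = greedy_independent_dominating_set(Q3)
```

The greedy scan is inherently sequential, which is why the paper resorts to spanning-tree levels and colorings to reach logarithmic parallel time.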

Now we are ready to describe the parallel algorithm for finding a “large” cut. 
As already stated, we start from any bipartition of the vertices. The first step 
consists in eliminating vertices of solid degree 0, if they exist. Furthermore, in 
order to satisfy the hypotheses of Lemma 1 and Lemma 2, we move some vertices 
from $V_l$ to $V_r$ in order to obtain $V_l = L_3$. Observe that $|E_s|$ may decrease during 
this step, but we need it to obtain a stronger structure of the bipartite graph 
B. 

At this point we eliminate 1-degree vertices from $V_r$; unfortunately, we are 
not able to ensure that 1-degree vertices disappear from the whole graph, 
because they can be generated in $V_l$, although they completely disappear from 
$V_r$. We can however guarantee that the performance ratio of our algorithm is 
1.3. 

In the following we give an outline of our algorithm and then we detail it 
step by step. 

ALGORITHM Parallel-Approx-Max-Cut(G); 

Input: a cubic graph G(V, E) with V = {0, 1, ..., n - 1} 

Output: A bipartition (V_l, V_r) of the vertices of G such that the Max Cut is 
approximated by a ratio of 1.3 

Step 1: 

    V_l <- {v such that v is even} 

    V_r <- {v such that v is odd} 

    Eliminate-0-Degree(V_l, V_r) 

Step 2: 

    Make-Left-Side-Of-Degree-3(V_l, V_r) 

    Eliminate-0-Degree(V_l, V_r) 

Step 3: 

    Eliminate-1-Degree-From-Right-Side(V_l, V_r) 

    Eliminate-0-Degree(V_l, V_r) 

RETURN(V_l, V_r) 

Recall that moving a node of solid degree 0 or 1 to the opposite side increases 
the number of solid edges, since all its incident dotted edges become solid and 
vice versa. As our algorithm works in parallel, we must avoid moving 
adjacent vertices in the same parallel step, because this could 
nullify the local improvement. Hence, our algorithm makes strong use of 
coloring procedures to enforce this independence property. In the following, we 
detail its main steps. 
— Eliminate-0-Degree($V_l$, $V_r$) 

The aim of this procedure is to eliminate 0-degree vertices from G by moving 
some of them to the opposite side. Observe that transferring an independent 
dominating set of $L_0 \cup R_0$ makes $L_0 = R_0 = \emptyset$, and the solid degree of 
each moved vertex becomes 3. Thus Lemma 5 can be applied to find an 
independent dominating set whose vertices can be moved to the opposite 
side in a single parallel step. The procedure returns the updated sets $V_l$ and 
$V_r$. 

This procedure can be run on a CRCW PRAM model in $O(\log n)$ time using 
$O(n)$ processors. 

— Make-Left-Side-Of-Degree-3($V_l$, $V_r$) 

In order to eliminate all vertices of solid degree 1 or 2 in the left side, let 
us consider the subgraph induced by $L_1 \cup L_2$ (by the previous algorithm 
step we can assume $L_0 = \emptyset$). If we consider a 3-coloring of such a graph and 
move to $V_r$ the vertices of any two colors, we guarantee that the remaining vertices have 
degree 3. Since we are interested in making $L_3$ as large as possible (recall that 
$|E_s| = 3 l_3$), among all sets of vertices of any pair of colors we choose to 
move the least numerous one. A convenient 3-coloring can be found by means 
of the algorithm described in the proof of Lemma 4. 

This procedure can be run on an EREW PRAM model in $O(\log n)$ time using 
$O(n)$ processors. 

— Eliminate-1-Degree-From-Right-Side($V_l$, $V_r$) 

Let us consider the subgraph induced by $R_1$. After 3-coloring its vertices, we 
move those with color 1 to $V_l$ in order to eliminate 1-degree vertices from 
the right side. Unlike the previous procedure, here we move only one color 
because we want to minimize the possible new vertices of degree 1 created in 
$V_l$ (note that moving both a vertex colored 3 and its adjacent vertex 
colored 1 would imply that both of them are added to $L_1$). Unfortunately, 
not moving vertices colored 3 does not guarantee that $L_1$ is empty at the 
end of the procedure: indeed, a vertex in $L_3$ can become of degree 1 whenever 
it is adjacent to two vertices in $V_r$ colored 1. 

The running time of this procedure is the same as the previous one. 



Lemma 6. The execution of the procedure Eliminate-1-Degree-From-Right-Side($V_l$, $V_r$) 
increases the cardinality of $E_s$ by k, if k vertices are moved from $R_1$ to 
$V_l$. 

Proof. The assertion follows from the fact that moved vertices are independent 
since all of them have color 1 and from the observation of Figure 1(b). 



Lemma 7. During the execution of the procedure 
Eliminate-1-Degree-From-Right-Side($V_l$, $V_r$) at least $\frac{r_1 - c_r}{2}$ vertices are moved from $V_r$ to $V_l$. 

Proof. Let us denote by $p_o$ and $p_e$ the number of vertices in $R_1$ involved in odd- 
and even-length paths in the subgraph induced by $V_r$, where the length of a 
path is defined to be the number of its edges. Analogously, $c_o$ and $c_e$ denote the 
number of vertices in $R_1$ involved in odd- and even-length cycles in the same 
subgraph. 

As far as the procedure Eliminate-1-Degree-From-Right-Side($V_l$, $V_r$) is concerned, 
we move from the right side to the left one the following number of vertices: 

— for each path of even length $l$, $\frac{l}{2}$ vertices; therefore, in total at least $\frac{p_e}{2}$. Indeed, 
let $l_1, l_2, \ldots, l_k$ be the lengths of the even-length paths in the subgraph induced 
by $V_r$. Then, $l_1 - 1, l_2 - 1, \ldots, l_k - 1$ are the numbers of vertices of degree 1 
in such paths. Therefore, $\sum_{i=1}^{k} l_i = p_e + k$. The number of moved vertices is 
$\sum_{i=1}^{k} \frac{l_i}{2} = \frac{1}{2} p_e + \frac{1}{2} k \ge \frac{1}{2} p_e$; 

— for each odd-length path of length $l$, $\frac{l-1}{2}$ vertices; therefore, with reasoning 
similar to the previous one, in total exactly $\frac{p_o}{2}$; 

— for each even-length cycle, exactly half of its vertices; therefore, in total 
exactly $\frac{c_e}{2}$; 

— for each cycle of odd length $l$, $\frac{l-1}{2}$ vertices; therefore, in total exactly 
$\sum_{i=1}^{c_r} \frac{l_i - 1}{2}$, where $l_i$ is the length of the i-th odd-length cycle. This sum 
is equal to $\frac{c_o}{2} - \frac{c_r}{2}$. 

Hence, by summing up all these contributions, the number of vertices moved 
from right to left is at least $\frac{p_e}{2} + \frac{p_o}{2} + \frac{c_e}{2} + \frac{c_o}{2} - \frac{c_r}{2} = \frac{r_1}{2} - \frac{c_r}{2}$. 

The algorithm outputs a partition of the vertices of the graph into two sets 
$V_l$ and $V_r$ and the solution is the set $E_s$, i.e. the set of edges connecting $V_l$ with 
$V_r$. The next theorem guarantees the approximation ratio 1.3 for our algorithm. 

Theorem 8. The performance ratio of Parallel-Approx-Max-Cut(G) is 1.3. 

Proof. Observe that the size of the optimal maximum cut cannot exceed the 
difference between the number of edges and the maximum number of vertex-disjoint 
odd-length cycles, i.e. $\frac{3}{2} n - c \le \frac{3}{2} n - c_r$. 

From the idea behind the algorithm and Lemma 7 it follows that the size of 
our approximate solution is at least $3 l_3 + \frac{r_1 - c_r}{2}$. 

Consequently, the approximation ratio of our algorithm is 

$$R \le \frac{\frac{3}{2} n - c_r}{3 l_3 + \frac{r_1}{2} - \frac{c_r}{2}},$$

that is a function of $c_r$, bounded by its maximum over the definition interval 
of $c_r$. 

This function always decreases since its derivative 

$$\frac{-3 l_3 - \frac{r_1}{2} + \frac{3}{4} n}{\left(3 l_3 + \frac{r_1}{2} - \frac{c_r}{2}\right)^2}$$

is always negative in view of Lemma 1 and Lemma 2: 

$$-3 l_3 - \frac{r_1}{2} + \frac{3}{4} n \le -3 l_3 - \frac{1}{2}(2n - 5 l_3) + \frac{3}{4} n = -\frac{l_3}{2} - \frac{n}{4} \le -\frac{n}{8} - \frac{n}{4} < 0.$$

Considering the definition interval for $c_r$ given in Lemma 3, it is easy to prove 
that the maximum of our function is attained for $c_r = 0$. Furthermore, from Lemma 
2 and Lemma 1 we have the following chain of inequalities: 

$$R \le \frac{\frac{3}{2} n}{3 l_3 + \frac{r_1}{2}} \le \frac{\frac{3}{2} n}{3 l_3 + \frac{1}{2}(2n - 5 l_3)} \le \frac{\frac{3}{2} n}{n + \frac{n}{8}} = \frac{4}{3} \approx 1.3.$$
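
The arithmetic behind the final bound can be checked with exact fractions. The Python sketch below assumes the reading of the proof in which $c_r = 0$ and $r_1$ is replaced by its lower bound $2n - 5 l_3$ from Lemma 2, so that the ratio is bounded by $\frac{3n/2}{3 l_3 + (2n - 5 l_3)/2}$ with $l_3$ ranging over $[n/4, n/2]$ as in Lemma 1; the function name `ratio_bound` and the sample value of n are ours.

```python
from fractions import Fraction

def ratio_bound(n, l3):
    """Upper bound (3n/2) / (3*l3 + (2n - 5*l3)/2) on the approximation ratio."""
    return Fraction(3 * n, 2) / (3 * l3 + Fraction(2 * n - 5 * l3, 2))

n = 64  # any n divisible by 4 keeps l3 = n/4 an integer
worst = max(ratio_bound(n, l3) for l3 in range(n // 4, n // 2 + 1))
best = min(ratio_bound(n, l3) for l3 in range(n // 4, n // 2 + 1))
```

The denominator simplifies to $n + l_3/2$, so the bound is largest at $l_3 = n/4$, where it equals 4/3, consistent with the ratio 1.3 stated in the theorem.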




4 Conclusions 

We presented a parallel approximation algorithm for Max Cut on cubic graphs 
that achieves a performance ratio equal to 1.3, substantially improving the best 
known parallel approximation ratio, i.e. 2, in the special case of cubic graphs. 

Starting from any bipartition of the vertices, the general strategy of our 
algorithm consists of increasing the number of edges in the cut by appropriately 
moving vertices from one side of the bipartition to the other. 

The algorithm relies on simple coloring procedures and can be efficiently 
implemented in $O(\log n)$ parallel time on a CRCW PRAM with $O(n)$ processors. 

It would be interesting to test the experimental behavior of this 
algorithm and to compare the quality of its solutions with the quality of the solutions 
found by the best known sequential algorithm on cubic graphs. Moreover, a 
generalization of the approach to d-regular graphs would also be a valuable 
contribution. 
Abstract. In this paper, a new framework for rapid system design and 
implementation for fuzzy systems is proposed. The given system specifi- 
cation is separated into two components: a conceptual specification and a 
parameter specification. The conceptual specification defines the system 
core, which is rarely changed during the system adjustment and may be 
implemented in hardware at an early stage of the design process. The 
parameter specification defines parameters that are often changed and 
are implemented in software for easy adjustment. Such a partitioning ap- 
proach integrates both hardware and software capability. The presented 
methodology gives a rapid prototyping and efficient system tuning pro- 
cess which, therefore, can reduce the development cycle time and increase 
flexibility when designing fuzzy systems. 



1 Introduction 

In the design of application specific systems, the iterative process of detailing 
specifications, designing a possible solution, and testing and verifying the 
solution is repeatedly performed until the solution meets user requirements [3]. 
However, such a repeated cycle prolongs the system development time. The 
system life cycle is likely to be short due to the rapid pace of improvements in current 
technology. The need for early prototyping, therefore, becomes critical to 
implement a system specification and provide customers with feedback during the 
design process. 

Recently, fuzzy systems have been widely used in many industrial 
products [17,9]. Since such systems are designed to mimic human knowledge, the 
performance of the systems is only as good as the knowledge given by experts. 
Therefore, a large number of real-time experiments are required in order to verify the 
system performance. 

Considerable work has been done on developing hardware implementations 
for general-purpose fuzzy control systems. In [11,4], special hardware chips 
have been invented to speed up the fuzzy inference engine. Other work has 
concentrated on the design and implementation of fuzzy architectures and 
processors [15,16,2]. Software fuzzy systems are flexible but fail to provide high-speed 
results [12]. Although general-purpose fuzzy hardware can yield high-speed 
output, it may meet neither the stringent real-time requirements nor the specific 
needs of particular applications. Kan and Shragowitz 
presented a generic development tool for fuzzy systems [7]. Their tools 
provide flexibility in defining membership functions, rules, and fuzzy logic 
operations. Nevertheless, they do not focus on constructing hardware modules from 
the derived system. In [10], a computer-aided design tool for implementation of 
fuzzy controllers on FPGAs was proposed. However, it does not provide 
flexibility during a prototyping phase, e.g., membership functions are implemented 
as a boolean circuit. Unlike the others, our research provides a rapid prototyping 
framework for application specific fuzzy systems as well as efficient 
implementation. 
Fig. 1. Fuzzy system design flow 



Since one of the goals of using fuzzy logic is to reduce the computational 
complexity of designing systems, e.g., fuzzy control systems, fuzzy logic rule bases 
consist of human knowledge expressed in rule format. Figure 1 shows the 
proposed design approach mapped to the existing fuzzy system design flow. Based 
on our methodology, the system specification is divided into two 
subcomponents. The conceptual specification consists of the rule base portion, since it 
defines system functionality and is seldom changed or requires only a few 
modifications. For some classical fuzzy systems with practical applications, such 
as a temperature controller or an inverted pendulum, the rules are commonly 
available [14]. For detailed fuzzy system tuning, varying parameters are 
considered, such as the range of system operation, the shape of membership functions, 
the type of fuzzy logic operations, defuzzification methods, etc. We collectively 
include these portions in the parameter specification. Such a specification can be 
implemented in software for flexible tuning. 

In general, the proposed approach has the following benefits: 

1. rapid specification: the conceptual specification gives designers the ability to 
specify the overall concept of the system which is independent of parameter 
setting details. 

2. flexible tuning: the parameter specification and new system model allow de- 
signers to easily adjust the parameters and consider several possible solutions 
in order to meet user requirements. 

3. hardware/software capability: the integration of a hardware prototype and 
flexible parameterized software naturally takes advantage of the ideas of 
hardware/software co-design. 

4. shorter development process: partitioning the system specification into two 
components allows the possibility of developing hardware and software in 
parallel and early prototyping. 

Our method is discussed in the remainder of this paper. Section 2 
presents our system models. The synthesis methodology based on the models 
is discussed in Section 3. Section 4 presents an example of prototyping a fuzzy 
control system. Finally, Section 5 draws conclusions from this work. 

2 Terminologies 

A rule-based system consists of a collection of conditional statements such as 
If-Then rules. The variables in the if-part are called input variables while the 
variables in the then-part are called output variables. A particular value of an 
input/output variable is called an input/output instance. In a fuzzy system, the 
values of these variables specified in the rules can be linguistic variables. 

In the following, we present the Conceptual State Graph, which represents the 
state-oriented behavior of the rules. The model is close to the actual implementation 
of a rule base. 

Definition 1. A Conceptual State Graph (CSG) $G = (S, I, O, E, \omega)$ is a 
connected directed edge-weighted graph where $S$ is a set of nodes $s_i \in S$, $1 \le i \le p$, 
representing states of the graph, $I$ is the set of inputs, $O$ is the set of outputs 
including $\phi$, $E \subseteq S \times S$ is a set of edges, denoted by $s_i \to s_j$ or $e_{s_i,s_j}$; $s_i \in S$, $s_j \in S$, 
and $\omega$ is a function from $E$ to $I \times O$, representing the set of edge weights, denoted 
by $x/y$, where $x \in I$ and $y \in O$. 

For the edge weight $x/y$ of an edge $e_{u,v}$ or $u \to v$, if $y = \phi$ then no output 
is given for the edge from $u$ to $v$ with the input $x$. 

As an example, Figure 2(b) shows a simple rule base. In this 
rule base, variables x, y and z are input/output variables. The values of x, y and 
z can be L, M, H, which stand for the low, medium and high linguistic variables. 
The rule base may represent a temperature control system whose input is the 
temperature obtained from an external sensor and whose output is an adjusting 
level for a fan speed. Input linguistic variables low, medium, and high are used 
to decide which rules to activate. A temperature value may be classified as a low 
or a medium value, as its meaning is fuzzy. Hence, two rules may be fired for each 
input. 

Based on the rules in Figure 2(b), consider the CSG in Figure 2(a). $I$ 
contains all combinations of values for x and y according to the rules, that 
is $I = \{(x,L), (x,M), (x,H), (y,H)\}$. Similarly, $O = \{(z,L), (z,H), \phi\}$, 
$S = \{s_0, s_1, s_2, s_3\}$ and $E = \{e_{s_0,s_1}, e_{s_0,s_2}, e_{s_0,s_3}, e_{s_1,s_3}\}$. 
Edges to states $s_2$ and $s_3$ have 
output labels $(z,L)$ and $(z,H)$ respectively. The output label for the 
intermediate edge $(s_0, s_1)$ is $\phi$. The edge weights are: $\omega(s_0,s_1) = (x,M)/\phi$, 
$\omega(s_0,s_2) = (x,L)/(z,L)$, $\omega(s_0,s_3) = (x,H)/(z,H)$, $\omega(s_1,s_3) = (y,H)/(z,H)$. 



(a) CSG 

(b) Rules: 

If x = L Then z = L 
If x = M and y = H Then z = H 
If x = H Then z = H 

Fig. 2. System models 
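
The example graph can be written down as a small edge table and traversed. This is a minimal Python sketch; the names `PHI`, `EDGES` and `run_csg`, and the reading of the garbled edge list as s0-s1, s0-s2, s0-s3 and s1-s3, are assumptions rather than part of the paper.

```python
PHI = None  # stands for the "no output" symbol of the definition

# (source state, input label, output label, target state)
EDGES = [
    ("s0", ("x", "M"), PHI, "s1"),
    ("s0", ("x", "L"), ("z", "L"), "s2"),
    ("s0", ("x", "H"), ("z", "H"), "s3"),
    ("s1", ("y", "H"), ("z", "H"), "s3"),
]

def run_csg(edges, active, start="s0"):
    """Follow every edge whose input label is active and collect the outputs."""
    outputs, frontier, seen = set(), [start], set()
    while frontier:
        state = frontier.pop()
        if state in seen:
            continue
        seen.add(state)
        for src, label, out, dst in edges:
            if src == state and label in active:
                if out is not PHI:
                    outputs.add(out)
                frontier.append(dst)
    return outputs

fired = run_csg(EDGES, {("x", "M"), ("y", "H")})
```

The intermediate state s1 produces no output by itself: the rule with two inputs fires only when both of its labels are active.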



Using this model, one can easily implement a system with multiple input 
variables. This can be advantageous since many current fuzzy processors allow 
only a limited number of input variables. The CSG model can be transformed 
to an incompletely specified FSM and then to a completely 
specified FSM, to which traditional FSM synthesis can be applied [8]. The CSG 
is similar to the incompletely specified FSM, based on the assumption that no 
unspecified next state is encountered. All unspecified outputs, $\phi$, convey “don’t 
care” outputs, and the set of final states is the set of states where outputs are 
given. 

Note that one may view the CSG model as similar to fuzzy automata, where 
the state variable is fuzzy [6]. Given a certain input, two states may be fired 
simultaneously. Nevertheless, in this paper, we regard the CSG implementation 
as a traditional finite state machine rather than a fuzzy finite state machine. 
Developing hardware for a fuzzy finite state machine is more complex and, more 
importantly, requires embedding membership functions representing fuzzy states. 
In that case, membership functions are not explicitly separated from the CSG, i.e., 
the conceptual part. According to our approach, however, we intend to clearly 
distinguish the conceptual and parameter portions and establish flexibility in 
tuning membership functions in the parameter part. Thus, we apply traditional 
FSM synthesis to the CSG model for rapid rule base hardware development. 



3 Conceptual Specification Synthesis Algorithms 



In order to construct hardware for a conceptual specification, we first construct 
a CSG for a rule base in the following steps. 

step 1 Convert the rule specification into a single-output rule format. A rule 
of the form R : If ... Then $B_1, \ldots, B_l$ has multiple output variables 
($B_1 \ldots B_l$). This form can be broken down into a group of rules, each of 
which has a single output variable, e.g., If ... Then $B_1$, ..., If ... 
Then $B_l$. 

step 2 Convert the rule into a conjunctive form, e.g., If $X_1$ and $X_2$ and ... 
and $X_m$ Then $Y$. A rule with disjunctive clauses A and B in the form: If A or B 
Then C is broken down into two rules: $R_1$ : If A Then C, and 
$R_2$ : If B Then C. Further details of this conversion can be found in [5]. 

step 3 Construct a CSG using the following procedure. 

Algorithm 1 Construct_CSG 

Input: Rule base R = {r_i}, a current starting state s_c 
Output: an updated current state graph A 

begin 
  Lmax = Find_maxLabel(R)  /* find the input var. with max occurrences */ 
  if (Lmax != NULL) 
  then begin 
    /* partition vertices into two groups P and Q */ 
    /* P = vertices with input edge label Lmax */ 
    Let P ⊆ R be the set of rules which contain Lmax 
    Q = R - P; 
    foreach p ∈ P do 
    begin 
      if Lmax is the only input for p 
      then output = output(p), s_n = s_outlabel  /* output of p */ 
      else output = φ, s_n = s_k, nonoutput_state = true fi 
      if s_n ∉ A.S 
      then A.S = A.S ∪ {s_n} fi 
      if e_{s_c,s_n} ∉ A.E 
      then A.E = A.E ∪ {e_{s_c,s_n}} 
           A.ω = A.ω ∪ {Lmax/output} fi  /* add weight to edge e_{s_c,s_n} */ 
    end od 
    Ignore the input with label Lmax 
    if |P| > 0 
    then if nonoutput_state 
         then s_next = s_k, k = k + 1; call Construct_CSG with P, s_next fi 
    fi 
    if |Q| > 0 
    then call Construct_CSG with Q, s_c fi 
  end fi 
end 






Without loss of generality, assume that all system inputs are available at the 
same time and each clause in the if-part of each rule is conjunctive. For each 
n-input rule, a collection of state nodes that form an n-edge path, with input 
labels according to the rule, can be constructed. 

In Algorithm 1, the initial state is represented by $s_0$. All the rules are in set R. 
The initial call is Construct_CSG(R, $s_0$). The algorithm first counts the number 
of occurrences of each edge label. The label with the maximum number of occurrences 
(edge label Lmax) is selected. Note that Lmax is a tuple (X, L) where X is an input variable and 
L is a linguistic variable with respect to the universe of X. The rules are then 
partitioned into two groups: P, containing rules that have an input Lmax, and 
Q, containing rules without Lmax. The algorithm next examines every rule in 
P to determine if it has more than one input. 

If, during the first iteration, a rule with more than one input is found, a 
unique edge from the current state $s_c$ to $s_n$ is constructed. Note that, for 
simplicity in defining a unique state index, two kinds of state indices are introduced. 
All states are indexed by an integer value except the output states, where 
symbolic names are used. The state $s_n$ in this case is the next non-output state $s_k$. 
In cases where p has only one input, $s_n$ is set to an output state 
where output(p) is its output. This state is added to the CSG if it does not 
already exist, and an edge is constructed between $s_c$ and $s_n$ if necessary. Once 
all rules in group P have been considered, the input with label Lmax is removed 
from these rules. The nonoutput_state flag then indicates whether there was a rule with 
more than one input. If so, a new state $s_k$ was constructed during the 
algorithm. In this case, k is incremented and the algorithm is repeated for P 
with a new initial state, $s_{next}$. The algorithm is also repeated for Q using the 
old initial state. These two recursions stop after all inputs of a given set of rules 
are considered. 
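
The recursion described above can be prototyped in a few lines. The following Python sketch is one possible reading of the construction, not the authors' implementation: the rule representation (a frozenset of input labels plus one output label), the `out:` naming of output states, the deterministic tie-breaking among equally frequent labels and the name `construct_csg` are all assumptions.

```python
PHI = None  # "no output" marker for intermediate edges

def construct_csg(rules, state="s0", graph=None, counter=None):
    """Build a state graph from conjunctive single-output rules.

    rules: list of (frozenset of input labels, output label).
    Edges are stored as (source, input label, output label, target).
    """
    if graph is None:
        graph, counter = {"states": {state}, "edges": set()}, [0]
    counts = {}
    for inputs, _ in rules:
        for label in inputs:
            counts[label] = counts.get(label, 0) + 1
    if not counts:
        return graph
    # Most frequent label; ties are broken by sorted order (an assumption).
    lmax = max(sorted(counts), key=counts.get)
    P = [r for r in rules if lmax in r[0]]
    Q = [r for r in rules if lmax not in r[0]]
    remaining = []
    for inputs, out in P:
        if inputs == {lmax}:
            dst = f"out:{out}"  # hypothetical symbolic name of an output state
            graph["states"].add(dst)
            graph["edges"].add((state, lmax, out, dst))
        else:
            remaining.append((inputs - {lmax}, out))
    if remaining:
        counter[0] += 1
        nxt = f"s{counter[0]}"  # next non-output state
        graph["states"].add(nxt)
        graph["edges"].add((state, lmax, PHI, nxt))
        construct_csg(remaining, nxt, graph, counter)
    if Q:
        construct_csg(Q, state, graph, counter)
    return graph

RULES = [
    (frozenset({("x", "L")}), ("z", "L")),
    (frozenset({("x", "M"), ("y", "H")}), ("z", "H")),
    (frozenset({("x", "H")}), ("z", "H")),
]
csg = construct_csg(RULES)
```

Applied to the three rules of the Fig. 2 example, the sketch yields four states and four edges, with a single intermediate state for the two-input rule.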

After a CSG is established, a finite state machine (FSM) can simply be derived. 
Before deriving a finite state machine, an FSM optimization may be applied. 
Then, the inference process can begin using the hardware rule base. 

According to the fuzzy inference process, a special calculation is required to 
determine the output strength of each rule given the associated input strengths. 
We assume the min operation is used for computing fuzzy rule strength. Let $\mu_{X_{ij}}(x_j)$ 
denote a membership function corresponding to the linguistic variable $X_{ij}$ and 
the current input instance $x_j$ for rule i. Let $\mu_{Y_i}(y)$ be a membership function 
corresponding to the linguistic variable $Y$ and output instance $y$ for rule i. 
Suppose a rule requires m inputs. Given input instances $x_1, \ldots, x_m$, for each rule 
i, Equation 1 determines a modified output function $\mu'_{Y_i}(y)$ that is used in the 
defuzzification process 

$$\mu'_{Y_i}(y) = \begin{cases} \mu_{Y_i}(y) & \text{if } \mu_{Y_i}(y) < r_i \\ r_i & \text{otherwise} \end{cases} \qquad (1)$$

where the rule strength $r_i$, calculated by Equation 2, is the limit of the output 
strength $Y_i$. 

$$r_i = \min_{1 \le j \le m} \mu_{X_{ij}}(x_j) \qquad (2)$$

If more than one rules give the same linguistic variable output M and therefore 
generates a new function //y. (^), each Py^y) is combined by using the max 
operation. 
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fJ-Yi {y) = maxvn=y, (/iy^ (y)) (3) 

Based on the above equations and the given CSG, the following algorithm 
presents the overall inference process. 

Algorithm 2 InferenceCSG

Input: input instances $I = \{t_1, t_2, \ldots, t_n\}$

Output: output value $o$

1 begin

2 foreach $O_p$ where $p$ is an output linguistic variable do $O_p = 0$ od

3 foreach $t_i \in I$ do

4 foreach $l \in L_i$ do // for every linguistic variable of input $t_i$

5 compute $\mu_l(t_i)$ od od

7 foreach tuple $(\mu_{(x_1,l_1)}(t_1), \mu_{(x_2,l_2)}(t_2), \ldots, \mu_{(x_n,l_n)}(t_n))$, $\forall l_1 \in L_1, l_2 \in L_2, \ldots, l_n \in L_n$ do

8 $r = \min(\mu_{(x_1,l_1)}(t_1), \mu_{(x_2,l_2)}(t_2), \ldots, \mu_{(x_n,l_n)}(t_n))$

9 $q$ = compute CSG using $\mu_{(x_1,l_1)}(t_1), \mu_{(x_2,l_2)}(t_2), \ldots, \mu_{(x_n,l_n)}(t_n)$

10 if $q$ is not the don't-care variable $\phi$ then $O_q = \max(O_q, r)$ fi

11 od

12 $o$ = defuzzify($\{O_p : p \text{ is an output linguistic variable}\}$)

13 end

Let {p} be the set of output linguistic variables. $O_p$ refers to a register
that holds the current cut value for output linguistic variable p. Initially,
all registers are set to zero. Let $L_i$ be the set of linguistic variables for
input $x_i$ and $t_i$ be an input instance for input $x_i$. In Algorithm 2,
Lines 3-5 fuzzify all inputs. After that, all fuzzified inputs are fed to the
CSG, which selects the corresponding output linguistic variable (Line 9).
Meanwhile, the minimum of these fuzzified values is calculated and is
accumulated in Line 10 if an output is fired.
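The per-rule computations of Equations 1 and 2 and the accumulation in Line 10 of Algorithm 2 can be sketched in C as follows. This is only a software illustration under our own naming, not the hardware design described in this paper: the function names rule_strength, clip_output, and accumulate_cut are hypothetical, and membership values are taken as doubles in [0, 1].

```c
#include <assert.h>
#include <stddef.h>

/* Equation 2: the rule strength is the minimum of the fuzzified inputs. */
double rule_strength(const double *mu, size_t m)
{
    assert(m > 0);
    double r = mu[0];
    for (size_t j = 1; j < m; j++)
        if (mu[j] < r)
            r = mu[j];
    return r;
}

/* Equation 1: the output membership value is cut at the rule strength r. */
double clip_output(double mu_y, double r)
{
    return mu_y < r ? mu_y : r;
}

/* Line 10 of Algorithm 2: keep the largest cut seen for one output label. */
void accumulate_cut(double *cut, double r)
{
    if (r > *cut)
        *cut = r;
}
```

Because every rule with the same output label clips the same membership function, taking the max of the clipped functions (Equation 3) gives the same result as clipping once at the largest rule strength. This is why a single register per output label is sufficient.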



4 Design Case Study 

We now consider designing a temperature control system for the oven found
in [13]. The control system has two temperature input variables x1 and x2 and
outputs a new temperature u to which the oven temperature should be adjusted.
Figure 3 shows the set of rules, which form part of the conceptual specification.



1. If x1 = low (x1, L) and x2 = low (x2, L) Then u = high (u, H)

2. If x1 = low (x1, L) and x2 = medium (x2, M) Then u = medium (u, M)

3. If x1 = low (x1, L) and x2 = high (x2, H) Then u = low (u, L)

4. If x1 = medium (x1, M) and x2 = low (x2, L) Then u = high (u, H)

5. If x1 = medium (x1, M) and x2 = high (x2, H) Then u = low (u, L)

6. If x1 = high (x1, H) and x2 = low (x2, L) Then u = high (u, H)

7. If x1 = high (x1, H) and x2 = medium (x2, M) Then u = medium (u, M)

8. If x1 = high (x1, H) and x2 = high (x2, H) Then u = low (u, L)



Fig. 3. Rule set in oven example 



In Figure 3, the tuple next to each clause is a shorthand notation for the
clause's linguistic variable. Figure 4 shows the CSG obtained from Algorithm 1
for the above rules. The state set S is {s0, s1, s2, s3, s(u,M), s(u,L), s(u,H)},
where transitions to states s(u,M), s(u,L), and s(u,H) output the temperature
values medium, low, and high respectively.




Fig. 4. Behavioral network and CSG 




Since the rule base is well established, we implement it directly in hardware;
Figure 5 shows an ASIC implementation of the CSG in Figure 4. In general, one
may use rapid prototyping technology such as an FPGA (Field Programmable Gate
Array) to implement a set of rules. In Figure 5, the linguistic variables
(x1, L), (x1, M), (x1, H), (x2, L), (x2, M), (x2, H) are represented by
000, 001, 010, 011, 100, and 101, while the output variables (u, H), (u, M),
(u, L), and φ are encoded as 00, 01, 10, and 11 respectively. The initial state
s0 is encoded as 0. States s1, s2, and s3 are encoded as 1, 2, and 3
respectively. After optimization, states s(u,H), s(u,M), and s(u,L) can be
merged with the initial state to save one control step and be ready to accept
the next input. Thus, we use only 2 bits to encode the existing states in this
design. Any sequence of inputs that does not fire any rule leads to the initial
state. Figure 6 partially demonstrates the behavior of the implementation. For
example, in the case of (x1, L) and (x2, M), the circuit receives 0 and 4
respectively and produces 1 ((u, M)) as output. If 2 and 3 are fed to the
circuit, it produces output 0, which is equivalent to rule 6 (If (x1, H) and
(x2, L) Then (u, H)).

Fig. 6. Simulation results of circuits in Figure 5
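The input and output encodings quoted above can be checked against the rule set of Figure 3 with a small C sketch. It is an assumption-laden simplification: the CSG consumes one input at a time through states s0 to s3, whereas the hypothetical csg_output function below collapses that sequence into a single table lookup, and all identifier names are ours.

```c
#include <assert.h>

/* Encodings of the input linguistic variables, as quoted in the text. */
enum { X1_L = 0, X1_M = 1, X1_H = 2, X2_L = 3, X2_M = 4, X2_H = 5 };

/* Encodings of the outputs; U_NONE stands for the don't-care value phi. */
enum { U_H = 0, U_M = 1, U_L = 2, U_NONE = 3 };

/* Rule set of Figure 3: rows are x1 = L, M, H; columns are x2 = L, M, H. */
static const int rule_table[3][3] = {
    { U_H, U_M,    U_L },   /* rules 1, 2, 3 */
    { U_H, U_NONE, U_L },   /* rules 4, (no rule), 5 */
    { U_H, U_M,    U_L },   /* rules 6, 7, 8 */
};

/* Returns the output code for one encoded (x1, x2) input pair. */
int csg_output(int in1, int in2)
{
    if (in1 < X1_L || in1 > X1_H || in2 < X2_L || in2 > X2_H)
        return U_NONE;
    return rule_table[in1][in2 - X2_L];
}
```

With this table, csg_output(0, 4) yields 1 and csg_output(2, 3) yields 0, matching the two cases described for Figure 6, while the pair (x1, M), (x2, M) fires no rule.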

In this design, the parameter specification includes the membership functions
and the defuzzification approach. Since all input and output variables are
temperatures, the same membership functions are used for all variables. The
temperature domain is restricted to between 0 and 500 degrees centigrade.
Figure 7(a) presents sample code for these membership functions in C.

For the defuzzification method, we have a choice between the centroid method
($z^* = \int \mu(z)\, z\, dz \,/ \int \mu(z)\, dz$) and the weighted average
method ($z^* = \sum \mu(\bar{z})\, \bar{z} \,/ \sum \mu(\bar{z})$) [13]. The
code for these methods is shown in Figure 7(b).

Figure 8 shows the integration of the conceptual and parameter components
and their data paths for computing outputs. We use the Mentor Graphics tool [1]
to synthesize this circuit. The grey-dashed components, fuzzification and
defuzzification, are parameterized components, whereas the others are fixed
components. For ease of implementation, the membership values are scaled from
real values in [0, 1] to integer values in [0, 256]. The parameterized parts
presented in Figure 7 are developed in VHDL. The input feeder component
sequentially feeds the proper inputs to the CSG rules (whose implementation is
shown in Figure 5). This component is implemented simply as a series of
multiplexers with different inputs. Since the CSG processes one input at a
time, we use only one min component to compute a rule strength iteratively. The
output of the CSG controls the accumulation of the cut on the output linguistic
variables. The component at the bottom left of the figure represents the
feedback calculation, where the control action u together with the inputs is
used to calculate the new inputs for the next cycle. In this example, we use
the following equation.



$$x_1(t+1) = -2x_1(t) + x_2(t) + u(t), \qquad x_2(t+1) = x_1(t) - x_2(t)$$

    #include <math.h>

    typedef enum { L, M, H } VAR;   /* linguistic labels: low, medium, high */

    /* Triangular membership functions over the 0..500 degree domain. */
    float trimf(float x, VAR l)
    {
        float output = 0;
        switch (l) {
        case L:
            if (x <= 70) output = 1;
            else if (x >= 210) output = 0;
            else output = -0.007 * x + 1.5;
            break;
        case M:
            if (x >= 350 || x <= 70) output = 0;
            else if (x <= 210) output = 0.007 * x - 0.47;
            else output = -0.007 * x + 2.47;
            break;
        case H:
            if (x >= 350) output = 1;
            else if (x <= 210) output = 0;
            else output = 0.007 * x - 1.45;
            break;
        }
        return output;
    }

    /* Bell-curve membership functions. */
    float bellmf(float x, VAR l)
    {
        float output = 0;
        switch (l) {
        case L:
            if (x <= 140) output = 1;
            else output = 1 / pow(1 + fabs((x - 70) / 320), 8);
            break;
        case M:
            output = 1 / pow(1 + fabs((x - 210) / 320), 8);
            break;
        case H:
            if (x >= 350) output = 1;
            else output = 1 / pow(1 + fabs((x - 350) / 320), 8);
            break;
        }
        return output;
    }

(a) Membership functions for triangular and bell curve

    /* x, y, z are the cut values of the outputs low, medium, and high. */
    float weighted_average(float x, float y, float z)
    {
        return (70 * x + 210 * y + 355 * z) / (x + y + z);
    }

    float centroid(float x, float y, float z, float (*mf)(float, VAR))
    {
        float point[3], all, sum_of_prod = 0, sum = 0;
        int i;
        /* integral discretization */
        for (i = 0; i <= 500; i++) {
            point[0] = fminf((*mf)((float) i, L), x);
            point[1] = fminf((*mf)((float) i, M), y);
            point[2] = fminf((*mf)((float) i, H), z);
            all = fmaxf(point[0], fmaxf(point[1], point[2]));
            sum_of_prod += i * all;
            sum += all;   /* accumulate the denominator */
        }
        return sum_of_prod / sum;
    }

(b) Defuzzification methods (3 inputs): weighted average and centroid

Fig. 7. Sample code for fuzzifier and defuzzifier



Fig. 8. Schematic of the oven example

Figure 9 presents one cycle of the simulation, where initially $x_1 = 90$ and
$x_2 = 100$. Assume that the triangular membership function in Figure 7(a) is
used and that the weighted average is the defuzzification method. The first
seven steps are devoted to the fuzzification part, which yields
$\mu_L(90) = 203$, $\mu_M(90) = 10$ and $\mu_L(100) = 183$, $\mu_M(100) = 36$.
After all fuzzified values are ready, the rule strength can be calculated. This
example takes 24 control steps to compute the output $u = 331$. The feedback
calculation results in $x_1 = 251$ and $x_2 = -10$ for the next cycle. Based on
this design, one can easily adjust the parameter parts, re-insert them into the
components, and redo the simulations. Once the parameters are finalized, the
prototype can be set up rapidly.
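The numbers of this cycle can be retraced in software. The sketch below is a hedged illustration rather than the synthesized circuit: it takes the fuzzified values quoted above as given, assumes the membership degree of "high" is 0 for both inputs (as the triangular function gives below 210 degrees), stores the rule set of Figure 3 as a table, and assumes the weighted-average centres 70, 210, and 355 as read from Figure 7(b). The names infer, weighted_average, and feedback are ours.

```c
#include <assert.h>

enum { LOW = 0, MEDIUM = 1, HIGH = 2, NONE = 3 };

/* Rule set of Figure 3, indexed by [label of x1][label of x2]. */
static const int rules[3][3] = {
    { HIGH, MEDIUM, LOW },
    { HIGH, NONE,   LOW },
    { HIGH, MEDIUM, LOW },
};

/* For every fired rule, keep the largest min-strength per output label. */
void infer(const int mu1[3], const int mu2[3], int cut[3])
{
    cut[LOW] = cut[MEDIUM] = cut[HIGH] = 0;
    for (int a = 0; a < 3; a++)
        for (int b = 0; b < 3; b++) {
            int q = rules[a][b];
            int r = mu1[a] < mu2[b] ? mu1[a] : mu2[b];
            if (q != NONE && r > cut[q])
                cut[q] = r;
        }
}

/* Weighted-average defuzzification with assumed centres 70, 210, 355. */
double weighted_average(const int cut[3])
{
    int total = cut[LOW] + cut[MEDIUM] + cut[HIGH];
    assert(total > 0);
    return (70.0 * cut[LOW] + 210.0 * cut[MEDIUM] + 355.0 * cut[HIGH]) / total;
}

/* Feedback step: x1' = -2*x1 + x2 + u, x2' = x1 - x2. */
void feedback(double x1, double x2, double u, double *nx1, double *nx2)
{
    *nx1 = -2.0 * x1 + x2 + u;
    *nx2 = x1 - x2;
}
```

With the inputs {203, 10, 0} and {183, 36, 0}, rules 1, 2, and 4 fire, the cuts become 0 (low), 36 (medium), and 183 (high), and the weighted average is 72525 / 219, about 331.2, which agrees with the reported u = 331. Feeding 90, 100, and 331 into the feedback step gives 251 and -10.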

5 Conclusion 

One of the most important issues in system design is to make the design easy to
adjust for system testing and rapid implementation. In this paper, we propose a
new design methodology for fuzzy systems which partitions system components
into conceptual and parameterized specifications. The conceptual specification
defines the core of the system specification, which rarely changes and may be
implemented in hardware to obtain a fast prototype. Parameterized components
are specified in software so that they can be easily edited when the system is
modified. This method integrates hardware and software capability, yielding
flexibility in re-specification and shortened prototyping time.

Fig. 9. One cycle simulation result of the schematic in Figure 8

Abstract. In mobile client-server database systems, caching of frequently
accessed data is an important technique that reduces contention on the
narrow-bandwidth wireless channel. As the server in a mobile environment may
not have any information about the state of its clients' caches (stateless
server), broadcasting the updated data lists to numerous concurrent mobile
clients is an attractive approach. In this paper, a caching policy is proposed
to maintain cache consistency for mobile computers. The proposed protocol
adopts aperiodic broadcasting as the cache invalidation scheme and supports
transaction semantics in mobile environments. With the aperiodic broadcasting
approach, the proposed protocol can improve throughput by reducing the abortion
of transactions at low communication cost. We study the performance of the
protocol by means of simulation experiments.



1 Introduction 

Mobile computing provides people with unrestricted mobility; it can satisfy
people's information needs at any time and in any place. Owing to recent
developments in hardware, such as small portable computers and wireless
communication networks, data management in mobile computing environments has
become an area of increased interest to the database community [2, 6, 9, 21].

In general, the bandwidth of the wireless channel is rather limited. Thus,
caching frequently accessed data items in a mobile computer can be an effective
approach to reducing contention on the narrow-bandwidth wireless channel
[10, 14, 24]. However, once caching is used, a cache invalidation strategy is
required to ensure that the cached data in mobile computers are consistent with
those stored in the server [13, 16, 25].

Several proposals have appeared in the literature regarding the support of
transactions in mobile systems [1, 11, 15, 26]. However, most of these
approaches did not attempt to make use of a common feature of wireless systems:
the ability of the server to broadcast information to the mobile clients.
Because the server in a mobile environment may not have any information about
the state of its clients' caches (stateless server), broadcasting the updated
data lists to numerous concurrent mobile clients is an attractive approach [8].

Fig. 1. The mobile database system architecture

Considering these limitations of mobile environments, Barbara and Imielinski
proposed an approach in which a server periodically broadcasts an invalidation
report listing the data items that have been updated [7, 8]. This approach is
attractive in mobile environments because a server need not know the location
and connection status of its clients, and because the clients need not
establish an uplink connection to invalidate their caches.

In this paper, we present a protocol that adopts aperiodic broadcasting to
reduce the wait time of a transaction that has requested commit, and to reduce
the abortion of transactions that would conflict under the periodic
broadcasting strategy. With the aperiodic broadcasting approach, the protocol
can reduce the number of broadcasts when the update ratio is low, and can
reduce the abortion of transactions when the update ratio is high.

The remainder of this paper is organized as follows. Section 2 introduces
the mobile transaction model. In section 3, we describe and discuss our caching
protocol. Section 4 presents experiments and results, and finally we state our
concluding remarks in section 5.

2 The Mobile Transaction Model 

Figure 1 presents a general mobile database system model. In this model, both
a database server and a database are attached to each fixed host [12, 22].
Users of the mobile computers may frequently query databases by invoking a
series of operations, generally referred to as a transaction. Serializability
is widely accepted as a correctness criterion for the execution of transactions
[3, 4, 17, 20]. This correctness criterion is also adopted in this paper. A
database server is intended to support basic transaction operations such as
resource allocation, commit, and abort.

Each mobile support station (MSS) has a coordinator which receives transaction
operations from mobile hosts and monitors their execution in database servers
within the fixed networks. Transaction operations are submitted by a mobile
host to the coordinator in its MSS, which in turn sends them to the distributed
database servers within the fixed networks for execution [2, 21].

In general, there are two ways to structure a mobile transaction processing
system [26]. In the first, the mobile host behaves like a remote I/O device,
and the data must be placed in the static part of the network [9, 14]. This
situation arises when the amount of resources on the mobile device is small.
In the second, the mobile host is a full-fledged server that manages
transactions, and data are placed locally in mobile hosts [1, 11, 18].

In this paper, we adopt a middle way between these two situations to structure
a mobile transaction processing system. That is, we still keep the data
locally, but we treat it as a cache rather than as a primary copy. Thus, as
shown in Figure 1, each mobile host can have its own cache to maintain some
portion of the data for later reuse. A transaction operation that accesses a
data item stored in the cache can be processed without interaction with the
database server. If the data item does not exist in the cache of the mobile
host, the host sends a message to the server to download the appropriate data
item.

3 Our Protocol 

In this section, we present our concurrency control protocol, which adopts
aperiodic broadcasting and supports the stateless server. We assume that
there is a central server that holds and manages all data. We also assume that
only one transaction may be initiated by a mobile client at any time. That is,
a mobile client can initiate a transaction only after the previous transaction
has finished.



3.1 Aperiodic Broadcasting 

The proposed protocol adopts the broadcasting approach to maintain cache
consistency and to control concurrent transactions. With the broadcasting
approach, the mobile client sends the commit request of a transaction after
executing all its operations, in order to install the updates in the central
database. The server then decides whether to commit or abort the requested
transaction, and notifies all of the clients in its cell of this decision by
broadcasting an invalidation report. The broadcasting strategy requires neither
high communication costs nor that the server maintain additional information
about the mobile clients in its cell, and is thus attractive in mobile
databases [8, 26].

Transactional Cache Management with Aperiodic Invalidation Scheme

Some proposals have appeared in the literature regarding the use of
broadcasting for the control of concurrent transactions, and all of these
approaches broadcast the invalidation reports in a synchronous (periodic)
manner [7, 8]. In these schemes, the server broadcasts the list of invalidated
data items and the list of transactions that will be committed among those
which have requested commit during the last period. These schemes present some
problems which arise from the synchronous manner of broadcasting.

— If two or more conflicting transactions have requested commit during the
same period, only one of them can commit and the others have to abort.

— The mobile client is blocked, on average, for half of the broadcasting
period until it learns whether the transaction commits or aborts.

In this paper, we adopt the aperiodic broadcasting approach for sending
the invalidation report. Unlike the schemes using periodic broadcasting, in our
scheme invalidation reports are broadcast immediately after a commit request
arrives. With the aperiodic broadcasting approach, most transactions that would
conflict within the same period under synchronous broadcasting can avoid the
conflict, and thus the protocol can reduce the abort rate of transaction
processing. Also, the blocking time of a transaction that has sent a commit
request can be reduced, as the server immediately broadcasts the invalidation
report which notifies commit or abort.

3.2 The Protocol 

Our protocol uses a modified version of optimistic control [5, 17, 19, 27] in
order to reduce the communication costs on the wireless network. With the
optimistic approach, all operations of a transaction can be processed locally
using cached data, and at the end of the transaction, the mobile client sends
the commit request message to the server. The server then immediately
broadcasts the invalidation report, which includes the data items that should
be invalidated and the mobile client identifier. When a mobile client receives
an invalidation report, the invalidating action preempts any ongoing
transaction that conflicts with the now-committed transaction. With this
approach, the mobile client can detect early the conflicts of a transaction
that would otherwise be aborted later at the server.

All data items in the system are tagged with a sequence number which
uniquely identifies the state of the data. The sequence number of a data item
is increased by the server when the data item is updated. The mobile client
includes the sequence number of its cached copy of the data (if the data item
is cache resident) in the commit request message.

We now summarize our protocol for both the mobile client and the server. 



Mobile Client Protocol: 



— Whenever a transaction becomes active, the mobile client opens two sets,
read_set and write_set. Initially, the state of the transaction is marked as
reading. A data item is added to these sets, together with its sequence
number, when the transaction requests a read or write operation on that data
item. The state of the transaction is changed to updating when the transaction
requests a write operation.

— Whenever the mobile client receives an invalidation report, it removes the
copies of the data items that are found in the invalidating list. If either of
the following conditions holds, a transaction in the reading state is changed
to the read-only state, and a transaction in the updating state is aborted.

read_set ∩ invalidating_list ≠ ∅
write_set ∩ invalidating_list ≠ ∅

— When a transaction in the read-only state requests a write operation, the
mobile client aborts the transaction.

— When a transaction is ready to commit, the mobile client commits a
transaction in the reading or read-only state locally. If the state of the
transaction is updating, the mobile client sends a commit request message along
with the read_set, the write_set, and the identification number of the mobile
client.

— After that, the mobile client listens for a broadcast invalidation report
that satisfies either of the following conditions.

read_set ∩ invalidating_list ≠ ∅
write_set ∩ invalidating_list ≠ ∅

If the invalidation report satisfying the above conditions carries the
identification number of this mobile client, the transaction is committed.
Otherwise, the mobile client aborts the transaction and removes the copies of
the data items that are found in the invalidating list.
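The state changes in the mobile client protocol above can be sketched as two small C functions. This is a minimal illustration under our own assumptions: data items are identified by plain integers, the sets are arrays, and the names TrState, on_invalidation, and on_write are hypothetical. Cache eviction and the commit handshake are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { TR_READING, TR_READ_ONLY, TR_UPDATING, TR_ABORTED } TrState;

/* True when the two arrays of data item ids share at least one element. */
static bool intersects(const int *a, size_t na, const int *b, size_t nb)
{
    for (size_t i = 0; i < na; i++)
        for (size_t j = 0; j < nb; j++)
            if (a[i] == b[j])
                return true;
    return false;
}

/* State change when an invalidation report arrives during a transaction. */
TrState on_invalidation(TrState s,
                        const int *read_set, size_t nr,
                        const int *write_set, size_t nw,
                        const int *inval, size_t ni)
{
    bool conflict = intersects(read_set, nr, inval, ni) ||
                    intersects(write_set, nw, inval, ni);
    if (!conflict)
        return s;
    if (s == TR_READING)
        return TR_READ_ONLY;
    if (s == TR_UPDATING)
        return TR_ABORTED;
    return s;   /* read-only and aborted transactions keep their state */
}

/* State change when the transaction requests a write operation. */
TrState on_write(TrState s)
{
    if (s == TR_READING)
        return TR_UPDATING;
    if (s == TR_READ_ONLY)
        return TR_ABORTED;
    return s;
}
```

A reading transaction whose read_set overlaps the invalidating list becomes read-only and can still commit locally; a later write request in that state aborts it.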



Server Protocol: 



— Whenever the server receives a commit request from a mobile client, and if
the sequence numbers of all data items in read_set and write_set are identical
to the server's, it broadcasts the invalidation report along with the
invalidating list, which is the list of data items in the write_set of the
commit request, and the identification number of the mobile client. Otherwise,
the server just ignores the commit request.
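The server-side check can be sketched in C as below. It is an illustrative sketch only: the database is reduced to an array of sequence numbers, N_OBJECTS and all identifiers are hypothetical, and the broadcast itself is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define N_OBJECTS 8   /* hypothetical database size for this sketch */

/* A data item id together with the sequence number cached by the client. */
typedef struct { int id; unsigned seq; } Item;

/* Current sequence number of every data item held by the server. */
static unsigned server_seq[N_OBJECTS];

/* True when every item in the set carries the server's sequence number. */
static bool all_current(const Item *set, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        assert(set[i].id >= 0 && set[i].id < N_OBJECTS);
        if (set[i].seq != server_seq[set[i].id])
            return false;
    }
    return true;
}

/* Returns true when the request is accepted: the sequence numbers of the
   written items are increased, and the caller would then broadcast the
   invalidation report (write_set ids plus the client id). Returns false
   when the request is stale and is simply ignored. */
bool handle_commit_request(const Item *read_set, size_t nr,
                           const Item *write_set, size_t nw)
{
    if (!all_current(read_set, nr) || !all_current(write_set, nw))
        return false;
    for (size_t i = 0; i < nw; i++)
        server_seq[write_set[i].id]++;
    return true;
}
```

If two clients both cached the same item at sequence number 0 and both request commit, the first request is accepted and raises the number to 1, so the second request no longer matches and is ignored without any extra message.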

The protocol described above adopts aperiodic broadcasting; thus the server
broadcasts an invalidation report immediately when it receives a commit request
from a mobile client. The aperiodic approach gives our protocol two advantages.
First, when write operations are infrequent at mobile clients, the protocol can
reduce the communication costs by broadcasting invalidation reports only when
an updating transaction occurs. It is unnecessary to send invalidation reports
periodically when there are no data items to invalidate. Second, when updating
transactions occur frequently, the protocol can avoid many aborts by reducing
the conflicts between updating transactions. In the synchronous broadcasting
approach, when two or more updating transactions conflict within a period, only
one of them can commit, as invalidation reports are broadcast once per period.
Our protocol can avoid many of these aborts, because mobile clients are
informed of the list of updated data items immediately.

Fig. 2. Example execution

In this case, the increased number of broadcasts is not an additional
communication overhead that may degrade the throughput of transaction
processing, as the server initiates only one broadcast per committing
transaction. As shown in the above protocol, no notification is required for an
aborted transaction. The server just ignores a commit request whose sequence
number has fallen behind the server's number.

An example execution under the protocol is shown in Figure 2. This example
uses two mobile clients and a server. In this example, r(x) and w(x) denote a
read and a write operation performed by a transaction on data item x.

In (a), the transaction initiated in mobile client 2 can be committed by the
server, because the immediate invalidation report lets it read the data item x
that has been updated by the transaction in mobile client 1. With the
synchronous broadcasting approach, if the commit requests of the two
transactions arrive in the same period, the two transactions cannot both be
committed. In (b) and (c), the state of the transaction in mobile client 2 is
changed from the reading state to the read-only state when the mobile client
receives the broadcast invalidation report. In case (b), the transaction can
commit locally, as its state is still read-only when the transaction is ready
to commit. However, in (c), the transaction is aborted, because it requests a
write operation in the read-only state. In example (d), mobile client 2 sends
the commit request slightly before it receives the invalidation report that
indicates the update of data item x. The server ignores the commit request, as
the sequence number of data item x in the message is lower than the server's.
No broadcast message is required for the aborted transaction: the transaction
in mobile client 2 is aborted when it receives the invalidation report for data
item x carrying the identification of mobile client 1. Thus, in our protocol,
the server initiates one broadcast for each committed transaction.

4 Performance

In this section, we develop the simulation model and present the results of ex- 
periments, in order to evaluate the performance of the proposed algorithm. 



4.1 Simulation Model 

This section describes our simulation model. We divide the simulation model
into three parts: the database model, the transaction model, and the system
model. The database model captures the characteristics of the database, such as
the database size and object attributes. The transaction model captures the
data object reference behavior of transactions in a workload. The system model
captures the characteristics of the system's hardware and software. Figure 1 in
section 2 shows the physical structure of the modeled system. The modeled
system is composed of a database server and mobile clients, connected by a
wireless network.



Database Model. The database is composed of multiple data objects which have
the same attributes. There are two attributes for each data object: identifier
and sequence number. The database model parameters are summarized in Table 1.
The number of objects in the database, NObject, was chosen to be relatively
small in order to model contention. For caching clients, the cache size,
CachePercent, is a constant. The content of each client's cache is fixed at the
start of the simulation by choosing objects uniformly from the database.



Table 1. Database parameters

Parameter       Meaning
NObject         Number of objects in the database
CachePercent    Percentage of cache size


Transaction Model. The transaction model supports the following operations.

— ReadObject: Reads an object from the client cache. If the object does not
exist in the cache, reads it from the server database.

— WriteObject: Updates an object in the client cache and increases the
sequence number.

— CommitTR: Commits a transaction.

— AbortTR: Aborts a transaction.

A transaction is modeled by a finite loop of ReadObject and WriteObject
operations, followed by a CommitTR or AbortTR operation. Table 2 summarizes the
parameters that characterize a transaction type. The number of ReadObject and
WriteObject operations in a transaction is called TRSize, which is uniformly
distributed between MinTRSize and MaxTRSize. The parameter ProbWrite indicates
the probability that WriteObject operations occur in a transaction. The delay
parameters are exponentially distributed delay times used to model an
interactive system.
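As an illustration of how such a workload might be generated, the C sketch below draws TRSize uniformly between the minimum and maximum size and then chooses each operation independently. Treating ProbWrite as a per-operation probability is our assumption (the text does not spell this out), and the names OpKind, uniform_int, and make_transaction are hypothetical. The exponentially distributed delays are omitted.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { OP_READ, OP_WRITE } OpKind;

/* Uniform integer in [lo, hi]; the small modulo bias of rand() is ignored. */
static int uniform_int(int lo, int hi)
{
    assert(lo <= hi);
    return lo + rand() % (hi - lo + 1);
}

/* Fills ops[] with one random transaction and returns its TRSize.
   ops must have room for max_size entries. */
int make_transaction(OpKind *ops, int min_size, int max_size, double prob_write)
{
    int n = uniform_int(min_size, max_size);
    for (int i = 0; i < n; i++) {
        double u = rand() / (RAND_MAX + 1.0);   /* uniform in [0, 1) */
        ops[i] = (u < prob_write) ? OP_WRITE : OP_READ;
    }
    return n;
}
```

With the settings used in the experiments, a call such as make_transaction(ops, 3, 15, 0.25) would produce between 3 and 15 operations, about a quarter of them writes on average.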



Table 2. Transaction parameters

Parameter     Meaning
TRSize        Number of operations in a transaction
ProbWrite     Probability that write operation occurs
ReadDelay     Average delay of a read operation
WriteDelay    Average delay of a write operation
TRState       State of transactions that are processed


System Model. The system model consists of a network manager and client and
server modules. The parameters for all the modules are summarized in Table 3.

Table 3. System Parameters

Parameter           Meaning
NClient             Number of mobile clients
NetDelay            Average communication delay on the wireless network
DBAccessDelay       Average delay to access database
CacheAccessDelay    Average delay to access cache
ReadHitProb         Probability of read hit at mobile client


4.2 Experiments 

In this section, we present results from several performance experiments
involving the protocol described in section 3. The main performance metric
presented is system throughput, measured in committed transactions per second.
The throughput results are, of course, dependent on the particular settings
chosen for the various physical system resource parameters. Thus, while the
throughput results show performance characteristics in what we consider to be a
reasonable environment, we also present other performance measures, the number
of messages and the percentage of aborts, to provide additional insight into
the fundamental tradeoffs between protocols. Table 4 shows the values of the
simulation parameters used for these experiments.

Table 4. Simulation Parameter Settings

Parameter           Value
NObject             1,000
CachePercent        5%
TRSize (Max)        15
TRSize (Min)        3
ProbWrite           0, 2.5, 5, 7.5, 10, 12.5, 15, 17.5, 20, 22.5, 25%
ReadDelay           10 ms
WriteDelay          40 ms
NClient             20
NetDelay            20 ms
DBAccessDelay       50 ms
CacheAccessDelay    10 ms
ReadHitProb         0.5


4.3 Results 

In this section, we present the results from several performance experiments. We 
ran a number of simulations to compare the behavior of the proposed protocol 
and the protocol with periodic broadcasting. 



Figure 3 shows the number of messages required to commit a transaction
with the proposed protocol and with the synchronous one. The message count of
each protocol increases in a sublinear fashion over the whole range of update
ratios. When the update ratio is lower than about 13%, our protocol, which uses
aperiodic broadcasting, requires fewer messages than the synchronous protocol,
because broadcasts are rare when few transactions update data. When the update
ratio is higher than 13%, the frequently broadcast invalidation reports are the
main factor that causes our protocol to require more messages. In the
synchronous protocol, the message cost does not increase as rapidly as in our
protocol, because the invalidation reports are broadcast periodically.

Fig. 3. Impact of updating ratio on message counts

Fig. 4. Impact of updating ratio on aborts of transactions

Fig. 5. Impact of updating ratio on throughput

Figure 4 presents the percentage of transaction aborts that take place
both at the client side and at the server. As can be seen, when the updating
ratio increases, the percentage becomes significant. When the updating ratio is
low, most aborts are due to a transaction's read operation on stale cached
data. This occurs when an invalidation report is not delivered because of
disconnection, or when a mobile client requests commit of a transaction before
it receives the invalidation report which includes a data item referenced by
the transaction. As the updating ratio gets higher, aborts increase with both
protocols, mainly because of the increased conflicts between transactions. In
the case of the synchronous protocol, the percentage increases more rapidly, as
the number of transactions that conflict within the same broadcasting period
increases.

Figure 5 shows the throughput results for the two protocols. As shown in this 
figure, the throughput degrades as write operations increase, because of the 
message costs of updating transactions. When the updating ratio is very low, 
the two protocols show almost the same performance, because in both cases few 
transactions are aborted by the server. However, the throughput of the 
synchronous protocol degrades more rapidly as the updating ratio increases, 
because of the increasing number of aborted transactions. With the synchronous 
approach, many aborts caused by conflicts may occur, as the server checks for 
conflicts between transactions only once per period. In our protocol, most such 
aborts can be avoided by broadcasting invalidation reports immediately after 
receiving a commit request. 

5 Conclusions 

We have presented a new protocol to support transactions in mobile 
client-server environments. The proposed protocol adopts a broadcasting cache 
invalidation strategy to support the stateless server scheme, a common feature 
in mobile environments. In this paper, we adopt an aperiodic approach to 
broadcasting invalidation reports. With this approach, our protocol can reduce 
communication costs when updating is rare, and can avoid most aborts of 
adjacent conflicting transactions without additional communication on the 
wireless network. Simulations were conducted to evaluate the performance of the 
protocol. Our performance experiments show the relative merits of the proposed 
protocol under various updating ratios. 

Abstract. Pure functional programming languages have been proposed 
as a vehicle to describe, simulate and manipulate circuit specifications. 
We propose an extension to Haskell to solve a standard problem when 
manipulating data types representing circuits in a lazy functional lan- 
guage. The problem is that circuits are finite graphs - but viewing them 
as an algebraic (lazy) datatype makes them indistinguishable from po- 
tentially infinite regular trees. However, implementations of Haskell do 
indeed represent cyclic structures by graphs. The problem is that the 
sharing of nodes that creates such cycles is not observable by any function 
which traverses such a structure. In this paper we propose an extension 
to call-by-need languages which makes graph sharing observable. 
The extension is based on non-updatable reference cells and an equality 
test (sharing detection) on this type. We show that this simple and practical 
extension has well-behaved semantic properties, which means that 
many typical source-to-source program transformations, such as might 
be performed by a compiler, are still valid in the presence of this extension. 



1 Introduction 

In this paper we investigate a particular problem of embedding a hardware 
description language in a lazy functional language - in this case Haskell. The 
“embedded language” approach to domain-specific languages typically involves 
designing a set of combinators (higher-order reusable programs) for an 
application area, and constructing individual applications by combining and 
coordinating individual combinators. See [Hud96] for examples of domain-specific 
languages embedded in Haskell. In the case of hardware design the objects 
constructed are descriptions of circuits; by providing different interpretations 
of these objects one can, for example, simulate, test, model-check or compile 
circuits to a lower-level description. For this application (and other embedded 
description languages) we motivate an extension to Haskell with a feature which 
we call observable sharing, which allows us to detect and manipulate cycles in 
data structures - a particularly useful feature when describing circuits 
containing feedback. Observable sharing is added to the language by providing 
immutable reference cells, together with a reference equality test. In the first 
part of the paper we present the problem and motivate the addition of observable 
sharing. 

A problem with observable sharing is that it is not a conservative extension of 
a pure functional language. It is a “side effect” - albeit in a limited form - for 
which the semantic implications are not immediately apparent. This means that 
the addition of such a feature risks the loss of many of the desirable semantic 
features of the host language. O’Donnell [O’D93] considered a form of observable 
sharing (Lisp-style pointer equality eq) in precisely the same context (i.e., the 
manipulation of hardware descriptions) and dismissed the idea thus: 

“ This (pointer equality predicate) is a hack that breaks referential trans- 
parency, destroying much of the advantages of using a functional lan- 
guage in the first place.” 

But how much is actually “destroyed” by this construct? In the second part of 
this paper we show - for our more constrained version of pointer equality - that 
in practice almost nothing is lost. 

We formally define the semantics of the language extensions and investigate their 
semantic implications. The semantics is an extension to a call-by-need abstract 
machine which faithfully reflects the amount of sharing in typical Haskell imple- 
mentations. 

Not all the laws of pure functional programming are sound in this extension. 
The classic law of beta-reduction for lazy functional programs, which we could 
represent as $\mathbf{let}\ \{x = M\}\ \mathbf{in}\ N = N[M/x]$ $(x \notin M)$, 
does not hold in the theory. However, since this law could duplicate an 
arbitrary amount of computation (via the duplication of the sub-expression M), 
it has been proposed that this law is not appropriate for a language like 
Haskell [AFM+95], and that more restrictive laws should be adopted. Indeed most 
Haskell compilers (and most Haskell programmers?) do not apply such arbitrary 
transformations - for efficiency reasons they are careful not to change the 
amount of sharing (the internal graph structure) in programs. This is because 
all Haskell implementations use a call-by-need parameter passing mechanism, 
whereby the argument to a function in a given call is evaluated at most once. 
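
As an illustration (assembled from the examples given later in Section 3, not a derivation taken from the original text), the failure of this law can be seen by instantiating it with $M = \mathbf{ref}\ y$ and $N = x \mathrel{\texttt{<=>}} x$, where ref creates a reference cell and <=> is the reference equality test:

```latex
\begin{align*}
\mathbf{let}\ \{x = \mathbf{ref}\ y\}\ \mathbf{in}\ x \mathrel{\texttt{<=>}} x
  &\quad \text{yields } \mathit{True} \text{ (one reference, compared with itself)} \\
(x \mathrel{\texttt{<=>}} x)[\mathbf{ref}\ y / x]
  \;=\; \mathbf{ref}\ y \mathrel{\texttt{<=>}} \mathbf{ref}\ y
  &\quad \text{yields } \mathit{False} \text{ (two distinct references)}
\end{align*}
```

The two sides differ only in how much sharing they contain, which is exactly what the extension makes observable.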

We develop the theory of operational equivalence for our language, and demon- 
strate that the extended language has a rich equational theory, containing, for 
example, all the laws of Ariola et al’s call-by-need lambda calculus [AFM+95]. 

2 Functional Hardware Description 

We deal with the description of synchronous hardware circuits in which the 
behaviour of a circuit and also its components can be modelled as functions from 
streams of inputs to streams of outputs. The description is realised using an 
embedded language in the pure functional language Haskell. There are good 
motivations in the literature for being able to use higher-order functions, 
polymorphism and laziness to describe hardware [She85, O’D96, CLM98, BCSS98]. 

Describing Circuits The approach of modelling circuits as functions on streams 
was taken as early as the days of μFP [She85], and later modernised in systems 
like Hydra [O’D96] and Hawk [CLM98]. The following introduction to functional 
circuit description owes much to the description in [O’D93]. 

Here are some examples of primitive circuit components modelled as functions. 
We assume the existence of a datatype Signal, which represents an input, output 
or internal wire in a circuit. 

inv   :: Signal -> Signal 
latch :: Signal -> Signal 
and   :: Signal -> Signal -> Signal 
xor   :: Signal -> Signal -> Signal 

We can put these components together in the normal way we compose functions; 
by abstraction, application, and local naming. Here are two examples of circuits. 
One consists of just an and-gate and an xor-gate, which is used as a component 
in the other. 

halfAdd a b   = (xor a b, and a b) 
fullAdd a b c = let (s1, c1) = halfAdd a b 
                    (s2, c2) = halfAdd s1 c 
                in  (s2, xor c1 c2) 

We use local naming of results of subcomponents using a let expression. The 
types of these terms are: 

halfAdd :: Signal -> Signal -> (Signal, Signal) 
fullAdd :: Signal -> Signal -> Signal -> (Signal, Signal) 

Here is a third example of a circuit. It consists of an inverter and a latch, put 
together with a loop, also called feedback. The result is a circuit that toggles its 
output. 

toggle :: Signal 
toggle = let output = inv (latch output) in output 

Note how we express the loop: by naming the wire and using it recursively. 

Simulating Circuits By interpreting the type Signal as streams of bits, and 
the primitive components as functions on these streams, we can run, or simulate 
circuit descriptions with concrete input. 

Here is a possible instantiation, where we model streams by Haskell’s lazy lists. 

type Signal = [Bool]  -- possibly infinite 

inv bs    = map not bs 
latch bs  = False : bs 
and as bs = zipWith (&&) as bs 
xor as bs = zipWith (/=) as bs 

We can simulate a circuit by applying it to inputs. The result of evaluating 
fullAdd [False, True] [True, True] [True, True] is ([False, True], [True, True]), 
that is, the (sum, carry) pairs (False, True) and (True, True) at the two time 
steps, while the result of toggle is [True, False, True, False, True, ... 

As parameters we provide lists or streams of inputs and as a result we get a 
stream of outputs. Note that the toggle circuit does not take any parameter and 
results in an infinite stream of outputs. The ability to both specify and 
execute hardware (and perform other operations on it) as a functional program is 
a claimed strength of the approach. 
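
The stream interpretation can be tried out directly. The following is a minimal sketch rather than the paper's own code: the gates are named and2 and xor2 (an assumption of this sketch, chosen so that they do not clash with the Prelude's and), and perCycle is a hypothetical helper that pairs up the sum and carry streams per clock cycle.

```haskell
-- Stream-based simulation of the circuits described above.
type Signal = [Bool]            -- one Bool per clock cycle, possibly infinite

inv, latch :: Signal -> Signal
inv      = map not
latch bs = False : bs           -- input delayed by one cycle, initially False

-- and2/xor2: assumed names, renamed to avoid the Prelude's `and`.
and2, xor2 :: Signal -> Signal -> Signal
and2 = zipWith (&&)
xor2 = zipWith (/=)

halfAdd :: Signal -> Signal -> (Signal, Signal)
halfAdd a b = (xor2 a b, and2 a b)

fullAdd :: Signal -> Signal -> Signal -> (Signal, Signal)
fullAdd a b c =
  let (s1, c1) = halfAdd a b
      (s2, c2) = halfAdd s1 c
  in  (s2, xor2 c1 c2)

-- The feedback loop is expressed by ordinary recursion on a lazy list.
toggle :: Signal
toggle = let output = inv (latch output) in output

-- Hypothetical helper: view a (sum, carry) pair of streams cycle by cycle.
perCycle :: (Signal, Signal) -> [(Bool, Bool)]
perCycle (s, c) = zip s c
```

For instance, take 5 toggle evaluates to [True, False, True, False, True], and perCycle applied to the full-adder example from the text gives [(False, True), (True, True)].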

Generating Netlists Simulating a circuit is not enough. If we want to 
implement it, for example on an FPGA, or prove properties about it, we need to 
generate a netlist of the circuit. This is a description of all the components 
of the circuit, and how they are connected. 

We can reach this goal by symbolic evaluation. This means that we supply vari- 
ables as inputs to a circuit rather than concrete values, and construct an ex- 
pression representing the circuit. In order to do this, we have to reinterpret the 
Signal type and its operations. 

A first try might be along the following lines. A signal is either a variable name 
(a wire), or the result of a component which has been supplied with its input 
signals. 

data Signal = Var String | Comp String [Signal] 

inv b   = Comp "inv" [b] 
latch b = Comp "latch" [b] 
and a b = Comp "and" [a, b] 
xor a b = Comp "xor" [a, b] 

Now we can, for example, symbolically evaluate halfAdd (Var "a") (Var "b"): 

(Comp "xor" [Var "a", Var "b"], Comp "and" [Var "a", Var "b"]) 

And, similarly, a full adder. But what happens when we try to evaluate toggle? 

Comp "inv" [Comp "latch" [Comp "inv" [Comp "latch" ... 

Since the Signal datatype is essentially a tree, and the toggle circuit contains a 
cycle, the result is an infinite structure. This is of course not usable as a symbolic 
description in an implementation. We get an infinite data structure representing 
a finite circuit. 
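
The unbounded unfolding can be made visible with a small experiment. This is a sketch under stated assumptions: and2 and xor2 are renamed gates, and prune and size are hypothetical helpers introduced here only to cut the tree off at a chosen depth and count its components.

```haskell
-- Symbolic (tree-shaped) interpretation of signals.
data Signal = Var String | Comp String [Signal]
  deriving (Eq, Show)

inv, latch :: Signal -> Signal
inv b   = Comp "inv" [b]
latch b = Comp "latch" [b]

-- and2/xor2: assumed names, renamed to avoid the Prelude's `and`.
and2, xor2 :: Signal -> Signal -> Signal
and2 a b = Comp "and" [a, b]
xor2 a b = Comp "xor" [a, b]

halfAdd :: Signal -> Signal -> (Signal, Signal)
halfAdd a b = (xor2 a b, and2 a b)

toggle :: Signal
toggle = let output = inv (latch output) in output

-- Hypothetical helper: cut a signal tree off below the given depth,
-- replacing the remainder by the placeholder variable "...".
prune :: Int -> Signal -> Signal
prune _ (Var v)          = Var v
prune 0 _                = Var "..."
prune n (Comp name args) = Comp name (map (prune (n - 1)) args)

-- Hypothetical helper: number of components in a finite signal tree.
size :: Signal -> Int
size (Var _)       = 0
size (Comp _ args) = 1 + sum (map size args)
```

Here size (prune n toggle) is n for every n >= 0, so no finite traversal of the tree ever reaches the end of a circuit that physically has only two components.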

We encounter a similar problem when we provide inputs to a circuit which 
are themselves output wires of another circuit. The Signal type is a tree, which 
means that when a result is used twice, it has to be copied. This shows that 
trees are inappropriate for modelling circuits, because physically, circuits have a 
richer graph-like structure. 

2.1 Previous Solutions 

One possible solution, proposed by O’Donnell [O’D93], is to give every use of a 
component a unique tag, explicitly. The signal datatype is then still a tree, but 
when we traverse that tree, we can keep track of which tags we have already 
encountered, and thus avoid cycles and detect sharing. 

In order to do this, we have to change the signal datatype slightly by adding a 
tag to every use of a component, for example as follows. 

data Signal = Var String | Comp Tag String [Signal] 

When we define a circuit, we have to explicitly label every component with a 
unique tag. O’Donnell then introduces some syntactic sugar for making it easier 
for the programmer to do this. 

Though presented as “the first real solution to the problem of generating netlists 
from executable circuit specifications [...] in a functional language”, it is 
awkward to use. A particular weakness of the abstraction is that it does not 
enforce that two components with the same tag are actually identical; there is 
nothing to stop the programmer from mistakenly introducing the same tag on 
different components. 
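
The tag-based traversal, and the weakness just described, can be sketched as follows. This is an illustration under assumptions, not O’Donnell's code: Tag is taken to be Int, and Input, Netlist, netlist and mistagged are hypothetical names introduced here.

```haskell
type Tag = Int   -- assumption: tags are plain integers

data Signal = Var String | Comp Tag String [Signal]

-- Hypothetical netlist format: one entry per component, giving its tag,
-- its name, and where each of its inputs comes from.
data Input = Wire String | From Tag
  deriving (Eq, Show)

type Netlist = [(Tag, String, [Input])]

-- Depth-first traversal that stops as soon as it meets a tag it has
-- already recorded, so cyclic descriptions are handled in finite time.
netlist :: Signal -> Netlist
netlist sig = reverse (go [] sig)
  where
    go acc (Var _) = acc
    go acc (Comp t name args)
      | any (\(t', _, _) -> t' == t) acc = acc
      | otherwise = foldl go ((t, name, map input args) : acc) args
    input (Var v)      = Wire v
    input (Comp t _ _) = From t

-- The toggle circuit, with every component labelled by hand.
toggle :: Signal
toggle = let output = Comp 0 "inv" [Comp 1 "latch" [output]] in output

-- A labelling mistake: two different components share the tag 0.
mistagged :: Signal
mistagged = Comp 0 "inv" [Comp 0 "latch" [Var "a"]]
```

With these definitions netlist toggle terminates with two entries, while netlist mistagged silently drops the latch, because nothing checks that equal tags denote the same component.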

But if explicit tagging is not the desired solution, why not let some underly- 
ing machinery guarantee that all the tags are unique? Monads are a standard 
approach for such problems (see e.g., [Wad92]). In functional programming, a 
monad is a data structure that can abstract from an underlying computation 
model. A very common monad is the state monad, which threads a changing 
piece of state through a computation. We can use such a state monad to gener- 
ate fresh tags for the signal datatype. This monadic approach is taken in Lava 
[BCSS98]. 

Introducing a monad implies that the types of the primitive components and 
circuit descriptions become monadic, that is, their result type becomes monadic. 
A big disadvantage of this approach is that we must change not only the types 
but also the syntax. We can no longer use normal function abstraction, local 
naming and recursion; we have to express these using monadic operators. All 
this turns out to be very inconvenient for the programmer. 
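
To make the change of style concrete, here is a minimal sketch of the monadic approach. It is not Lava's actual interface: the hand-rolled state monad Gen and the names comp, andM, xorM, halfAddM and fullAddM are assumptions of this sketch.

```haskell
type Tag = Int   -- assumption: tags are plain integers

data Signal = Var String | Comp Tag String [Signal]
  deriving (Eq, Show)

-- A minimal state monad that threads the next unused tag.
newtype Gen a = Gen { runGen :: Tag -> (a, Tag) }

instance Functor Gen where
  fmap f (Gen g) = Gen (\n -> let (a, n') = g n in (f a, n'))

instance Applicative Gen where
  pure a = Gen (\n -> (a, n))
  Gen f <*> Gen g =
    Gen (\n -> let (h, n1) = f n
                   (a, n2) = g n1
               in  (h a, n2))

instance Monad Gen where
  Gen g >>= k = Gen (\n -> let (a, n') = g n in runGen (k a) n')

-- Creating a component consumes one fresh tag.
comp :: String -> [Signal] -> Gen Signal
comp name args = Gen (\n -> (Comp n name args, n + 1))

andM, xorM :: Signal -> Signal -> Gen Signal
andM a b = comp "and" [a, b]
xorM a b = comp "xor" [a, b]

-- Every intermediate wire must now be bound with <- instead of let.
halfAddM :: Signal -> Signal -> Gen (Signal, Signal)
halfAddM a b = do
  s <- xorM a b
  c <- andM a b
  return (s, c)

fullAddM :: Signal -> Signal -> Signal -> Gen (Signal, Signal)
fullAddM a b c = do
  (s1, c1) <- halfAddM a b
  (s2, c2) <- halfAddM s1 c
  out      <- xorM c1 c2
  return (s2, out)
```

Running runGen (fullAddM (Var "a") (Var "b") (Var "c")) 0 labels the five gates with the tags 0 to 4. Note that a feedback loop such as toggle can no longer be written with a plain recursive let in this style, which is the inconvenience the text refers to.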

What we are looking for is a solution that does not require a change in the nat- 
ural circuit description style of using local naming and recursion, but allows us 
to detect sharing and loops in a description from within the language. 

3 Proposed Solution 

The core of the problem is: a description of a circuit is basically a graph, but we 
cannot observe the sharing of the nodes from within the program. The solution 
we propose is to make the graph structure of a program observable, by adding a 
new language construct. 

Objects with Identity The idea is that we want the weakest extension that 
is still powerful enough to observe if two given objects have actually previously 
been created as one and the same object. 

The reason for wanting as weak an extension as possible is that we want to retain 
as many semantic properties from the original language as possible. This is not 
just for the benefit of the programmer - it is important because compilers make 
use of semantic properties of programs to perform program transformations, and 
because we do not want to write our own compiler to implement this extension. 

Since we know in advance what kind of objects we will compare in this way, we 
choose to be explicit about this at creation time of the object that we might end 
up comparing. In fact, one can view the objects as non-updatable references. We 
can create them, compare them for equality, and dereference them. 

Here is the interface we provide to the references. We introduce an abstract type 
Ref, with the following operators: 

type Ref a = ... 
ref   :: a -> Ref a 
deref :: Ref a -> a 
(<=>) :: Ref a -> Ref a -> Bool 

The following two examples show how we can use the new constructs to detect 
sharing: 

(i)  let x = undefined in (let r = ref x in r <=> r) 
(ii) let x = undefined in ref x <=> ref x 

In (i) we create one reference, and compare it with itself, which yields True. In 
(ii), we create two different references to the same variable, and so the 
comparison yields False. 

Thus, we have made a non-conservative extension to the language; previously 
it was not possible to distinguish between a shared expression and two different 
instances of the same expression. We call the extension observable sharing. We 
give a formal description of the semantics in Section 4. 
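
One way to experiment with this interface in GHC is sketched below. It is an assumption of this sketch, not the implementation from the full version of the paper: it relies on unsafePerformIO and on pointer equality of IORef, and it only behaves as intended if the compiler does not merge or duplicate calls to ref, which is why common subexpression elimination and full laziness are switched off.

```haskell
{-# OPTIONS_GHC -fno-cse -fno-full-laziness #-}
import Data.IORef (IORef, newIORef, readIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Assumed representation: a reference is an IORef that is never written
-- to after creation.
newtype Ref a = Ref (IORef a)

-- Each evaluation of a `ref x` thunk allocates a new cell; the result is
-- shared exactly when the thunk itself is shared.
{-# NOINLINE ref #-}
ref :: a -> Ref a
ref x = unsafePerformIO (Ref <$> newIORef x)

deref :: Ref a -> a
deref (Ref r) = unsafePerformIO (readIORef r)

-- Equality on IORef compares the identity of the cells, not their contents.
(<=>) :: Ref a -> Ref a -> Bool
Ref r1 <=> Ref r2 = r1 == r2

-- The two examples from the text. The contents are never inspected, so
-- `undefined` is harmless here.
example1, example2 :: Bool
example1 = let x = undefined :: Int in (let r = ref x in r <=> r)
example2 = let x = undefined :: Int in ref x <=> ref x
```

Under these assumptions example1 is True and example2 is False when the module is loaded without optimisation; aggressive optimisation flags could still change the amount of sharing, which is precisely the sensitivity that Section 4 studies.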

3.1 Back to Circuits 

How can we use this extension to help us to symbolically evaluate circuits? Let 
us take a look at the following two circuits. 

circ1 = let output = latch output in output 
circ2 = let output = latch (latch output) in output 

In Haskell’s denotational semantics, these two circuits are identified, since circ2 
is just a recursive unfolding of circ1. But we would like these descriptions to 
represent different circuits; circ1 has one latch and a loop, whereas circ2 has 
two latches and a loop. If the signal type includes a reference, we could compare 
the identities of the latch components and conclude that in circ1 all latches are 
identical, whereas in circ2 we have two different latches. 

We can now modify the signal datatype in such a way that the creation of 
identities happens transparently to the programmer. 

data Signal = Var String | Comp (Ref (String, [Signal])) 

comp name args = Comp (ref (name, args)) 

inv b   = comp "inv" [b] 
latch b = comp "latch" [b] 
and a b = comp "and" [a, b] 
xor a b = comp "xor" [a, b] 

In this way, a circuit like toggle still creates a cyclic structure, but it is now 
possible to define a function which observes this cyclicity and therefore 
terminates when generating a netlist for the circuit. 
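
Such a netlist function could look as follows. This is a sketch under assumptions rather than the paper's implementation: Ref is approximated with unsafePerformIO and IORef identity (with GHC's common subexpression elimination and full laziness switched off so that sharing is left untouched), and Input, components and netlist are hypothetical names.

```haskell
{-# OPTIONS_GHC -fno-cse -fno-full-laziness #-}
import Data.IORef (IORef, newIORef, readIORef)
import System.IO.Unsafe (unsafePerformIO)

-- Assumed stand-in for the abstract Ref type.
newtype Ref a = Ref (IORef a)

{-# NOINLINE ref #-}
ref :: a -> Ref a
ref x = unsafePerformIO (Ref <$> newIORef x)

deref :: Ref a -> a
deref (Ref r) = unsafePerformIO (readIORef r)

(<=>) :: Ref a -> Ref a -> Bool
Ref r1 <=> Ref r2 = r1 == r2

data Signal = Var String | Comp (Ref (String, [Signal]))

comp :: String -> [Signal] -> Signal
comp name args = Comp (ref (name, args))

inv, latch :: Signal -> Signal
inv b   = comp "inv" [b]
latch b = comp "latch" [b]

toggle, circ1, circ2 :: Signal
toggle = let output = inv (latch output) in output
circ1  = let output = latch output in output
circ2  = let output = latch (latch output) in output

-- Distinct components reachable from a signal, in order of discovery.
-- The traversal stops when it meets a reference it has already seen.
components :: Signal -> [Ref (String, [Signal])]
components = reverse . go []
  where
    go seen (Var _) = seen
    go seen (Comp r)
      | any (<=> r) seen = seen
      | otherwise        = foldl go (r : seen) (snd (deref r))

data Input = Wire String | From Int
  deriving (Eq, Show)

-- Number the components by discovery order and describe their inputs.
netlist :: Signal -> [(Int, String, [Input])]
netlist sig =
  [ (i, name, map input args)
  | (i, r) <- zip [0 ..] nodes, let (name, args) = deref r ]
  where
    nodes = components sig
    input (Var v)  = Wire v
    input (Comp r) = From (length (takeWhile (not . (<=> r)) nodes))
```

Under these assumptions netlist toggle is a finite list with one inverter and one latch, and circ1 and circ2 are told apart: components finds one latch in circ1 and two in circ2. The visited set is a plain list because references support only an equality test, not ordering or hashing.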

3.2 Other Possible Solutions 

We briefly discuss two other solutions, both of which are more or less well-known 
extensions to functional programming languages. 

Pointer Equality The language is extended with an operator (>=<) :: a -> 
a -> Bool that investigates whether two expressions are pointer equal, that is, 
whether they refer to the same bindings. 

In our extension, we basically provide pointer equality in a more controlled way; 
you can only perform it on references, not on expressions of any type. This means 
we can implement our references using a certain kind of pointer equality. The 
other way around is not possible however, which shows that our extension is 
weaker. 

Gensym The language is extended with a new type Sym of abstract symbols 
with equality, and an operator that generates fresh such symbols, gensym. It is 
possible to define gensym in terms of our Refs, and also the other way around. 
With the reference approach however, we get an important law by definition, 
which is: r1 <=> r2 = True  ⇒  deref r1 = deref r2 

4 The Semantic Theory 

In this section we formally define the operational semantics of observable shar- 
ing, and study the induced notion of operational equivalence. For the technical 
development we work with a de-sugared core language based on an untyped 
lambda calculus with recursive lets and structured data. 

The language of terms, $\Lambda_{ref}$, is given by the following grammar: 

$L, M, N ::= x \mid \lambda x.M \mid M\,x \mid \mathbf{let}\ \{\vec{x} = \vec{M}\}\ \mathbf{in}\ N \mid \mathbf{ref}\ x \mid \mathbf{deref}\ M \mid M \mathrel{\texttt{<=>}} N$ 

Note that we work with a restricted syntax in which the arguments in function 
applications and the arguments to constructors are always variables (cf. 
[PJPS96, PJS98, Lau93, Ses97]). It is trivial to translate programs into this 
syntax by the introduction of let bindings for all non-variable arguments. 

The set of values, $Val \subset \Lambda_{ref}$, ranged over by V and W, are the 
lambda-expressions $\lambda x.M$. We will write 
$\mathbf{let}\ \{\vec{x} = \vec{M}\}\ \mathbf{in}\ N$ as a shorthand for 
$\mathbf{let}\ \{x_1 = M_1, \ldots, x_n = M_n\}\ \mathbf{in}\ N$ where the 
$\vec{x}$ are distinct, the order of bindings is not syntactically significant, 
and the $\vec{x}$ are considered bound in N and the $\vec{M}$ (i.e., all lets 
are potentially recursive). 

The only kind of substitution that we consider is variable for variable, with 
$\sigma$ ranging over such substitutions. The simultaneous substitution of one 
vector of variables for another will be written $M[\vec{y}/\vec{x}]$, where the 
$\vec{x}$ are assumed to be distinct (but the $\vec{y}$ need not be). 

4.1 The Abstract Machine 

The semantics for the standard part of the language presented in this section 
is essentially Sestoft’s “mark 1” abstract machine for laziness [Ses97]. Following 
[MS99], we believe an abstract machine semantics is well suited as the basis for 
studying operational equivalence. 

Transitions in this machine are defined over configurations consisting of (i) a 
heap, containing a set of bindings, (ii) the expression currently being 
evaluated, and (iii) a stack, representing the actions that will be performed on 
the result of the current expression. 

(Footnote to the term grammar of Section 4: in the full version of the paper we 
also include constructors and a case expression, as well as a strict sequential 
composition operator.) 

There are a number of possible ways to represent references in such a machine. 
One straightforward possibility is to use a global reference-environment, in which 
evaluation of the ref operation creates a fresh reference to its argument. We 
present an equivalent but syntactically more economical version. Instead of a 
reference environment, references are represented by a new (abstract) constructor 
(i.e. a constructor which is not part of $\Lambda_{ref}$), which we denote by 
$\underline{\mathbf{ref}}$. 

Let $\underline{\Lambda}_{ref} = \Lambda_{ref} \cup \{\underline{\mathbf{ref}}\,x \mid x \in Var\}$, 
and $\underline{Val} = Val \cup \{\underline{\mathbf{ref}}\,x \mid x \in Var\}$. We 
write $\langle \Gamma, M, S \rangle$ for the abstract machine configuration with 
heap $\Gamma$, expression $M \in \underline{\Lambda}_{ref}$, and stack S. A heap 
is a set of bindings from variables to terms of $\underline{\Lambda}_{ref}$; we 
denote the empty heap by $\emptyset$, and the addition of a group of bindings 
$\vec{x} = \vec{M}$ to a heap $\Gamma$ by juxtaposition: 
$\Gamma\{\vec{x} = \vec{M}\}$. 

A stack is a list of stack elements. The stack written $h : S$ will denote a 
stack S with h pushed on the top. The empty stack is denoted by $\epsilon$, and 
the concatenation of two stacks S and T by ST (where S is on top of T). Stack 
elements are either: 

- a variable x, representing the argument to a function, 

- an update marker $\#x$, indicating that the result of the current computation 
should be bound to the variable x in the heap, 

- a pending reference equality-test of the form $(\texttt{<=>}\ M)$, or 
$(\underline{\mathbf{ref}}\,x\ \texttt{<=>})$, 

- a dereference deref, indicating that the reference which is produced by the 
current computation should be dereferenced. 

We will refer to the set of variables bound by $\Gamma$ as dom $\Gamma$, and to 
the set of variables marked for update in a stack S as dom S. Update markers 
should be thought of as binding occurrences of variables. Since we cannot have 
more than one binding occurrence of a variable, a configuration is deemed 
well-formed if dom $\Gamma$ and dom S are disjoint. We write dom($\Gamma$, S) for 
their union. For a configuration $\langle \Gamma, M, S \rangle$ to be closed, any 
free variables in $\Gamma$, M, and S must be contained in dom($\Gamma$, S). 

For sets of variables P and Q we will write $P \perp Q$ to mean that P and Q are 
disjoint, i.e., $P \cap Q = \emptyset$. The free variables of a term M will be 
denoted FV(M); for a vector of terms $\vec{M}$, we will write FV($\vec{M}$). The 
abstract machine semantics is presented in Figure 1; we implicitly restrict the 
definition to well-formed configurations. The first collection of rules is 
standard. The second collection of rules concerns observable sharing. Rule 
(RefEq) first forces the evaluation of the left argument, and (Ref1) switches 
evaluation to the right argument; once both have been evaluated to ref 
constructors, variable-equality is used to implement the pointer-equality test. 
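
The rules of the machine can be transcribed almost literally into a small interpreter. The following is a sketch under stated assumptions rather than the authors' code: fresh heap names are generated from a counter and contain the character '#' (so source variables are assumed not to contain it), the results true and false of the equality test are represented by an assumed BoolV constructor, and a fuel bound stands in for non-termination.

```haskell
type Var = String

data Term
  = V Var | Lam Var Term
  | App Term Var                      -- arguments are always variables
  | Let [(Var, Term)] Term            -- recursive let
  | Ref Term | Deref Term | RefEq Term Term
  | RefC Var                          -- abstract constructor, machine-only
  | BoolV Bool                        -- assumed encoding of true/false
  deriving (Eq, Show)

data Frame = Arg Var | Upd Var | EqL Term | EqR Var | DerefF
  deriving (Eq, Show)

-- Heap, current term, stack, and a counter for fresh heap names.
type Config = ([(Var, Term)], Term, [Frame], Int)

isValue :: Term -> Bool
isValue (Lam _ _) = True
isValue (RefC _)  = True
isValue (BoolV _) = True
isValue _         = False

-- Variable-for-variable substitution that respects shadowing.
rename :: [(Var, Var)] -> Term -> Term
rename s t = case t of
  V x       -> V (look x)
  Lam x m   -> Lam x (rename (without [x]) m)
  App m x   -> App (rename s m) (look x)
  Let bs n  -> let s' = without (map fst bs)
               in  Let [(x, rename s' m) | (x, m) <- bs] (rename s' n)
  Ref m     -> Ref (rename s m)
  Deref m   -> Deref (rename s m)
  RefEq m n -> RefEq (rename s m) (rename s n)
  RefC x    -> RefC (look x)
  BoolV b   -> BoolV b
  where
    look x     = maybe x id (lookup x s)
    without xs = filter ((`notElem` xs) . fst) s

-- One machine transition; Nothing means the configuration is stuck.
step :: Config -> Maybe Config
step (h, t, s, n) = case (t, s) of
  (V x, _) -> do                                               -- (Lookup)
    m <- lookup x h
    Just (filter ((/= x) . fst) h, m, Upd x : s, n)
  (v, Upd x : s') | isValue v -> Just ((x, v) : h, v, s', n)   -- (Update)
  (App m x, _)          -> Just (h, m, Arg x : s, n)           -- (Unwind)
  (Lam x m, Arg y : s') -> Just (h, rename [(x, y)] m, s', n)  -- (Subst)
  (Let bs m, _) ->                                             -- (Letrec)
    let r  = [(x, x ++ "#" ++ show i) | ((x, _), i) <- zip bs [n ..]]
        h' = [(y, rename r b) | ((_, y), (_, b)) <- zip r bs]
    in  Just (h' ++ h, rename r m, s, n + length bs)
  (Ref m, _) -> let x = "ref#" ++ show n                       -- (Ref)
                in  Just ((x, m) : h, RefC x, s, n + 1)
  (Deref m, _)          -> Just (h, m, DerefF : s, n)          -- (Deref1)
  (RefC x, DerefF : s') -> Just (h, V x, s', n)                -- (Deref2)
  (RefEq m k, _)        -> Just (h, m, EqL k : s, n)           -- (RefEq)
  (RefC x, EqL k : s')  -> Just (h, k, EqR x : s', n)          -- (Ref1)
  (RefC y, EqR x : s')  -> Just (h, BoolV (x == y), s', n)     -- (Ref2)
  _                     -> Nothing

-- Run from the initial configuration until a value meets an empty stack.
eval :: Term -> Maybe Term
eval t = run (1000 :: Int) ([], t, [], 0)
  where
    run fuel c@(_, v, s, _)
      | isValue v && null s = Just v
      | fuel <= 0           = Nothing
      | otherwise           = step c >>= run (fuel - 1)
```

In this sketch the (Letrec) rule satisfies its freshness side condition by renaming the bound variables, and (Lookup) removes the binding from the heap while it is being evaluated, so a black hole such as let x = x in x gets stuck instead of looping. Evaluating the two examples of Section 3 with a lambda in place of undefined gives BoolV True for (i) and BoolV False for (ii).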

4.2 Convergence, Approximation, and Equivalence 

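Two terms will be considered equal if they exhibit the same behaviours when 
used in any program context. The behaviour that we use as our test of 
equivalence is simply termination. Termination behaviour is formalised by a 
convergence predicate: 

\[
\begin{array}{rcll}
\langle \Gamma\{x = M\},\ x,\ S \rangle &\to& \langle \Gamma,\ M,\ \#x : S \rangle & (\mathit{Lookup}) \\
\langle \Gamma,\ V,\ \#x : S \rangle &\to& \langle \Gamma\{x = V\},\ V,\ S \rangle & (\mathit{Update}) \\
\langle \Gamma,\ M\,x,\ S \rangle &\to& \langle \Gamma,\ M,\ x : S \rangle & (\mathit{Unwind}) \\
\langle \Gamma,\ \lambda x.M,\ y : S \rangle &\to& \langle \Gamma,\ M[y/x],\ S \rangle & (\mathit{Subst}) \\
\langle \Gamma,\ \mathbf{let}\ \{\vec{x} = \vec{M}\}\ \mathbf{in}\ N,\ S \rangle &\to& \langle \Gamma\{\vec{x} = \vec{M}\},\ N,\ S \rangle \quad \vec{x} \perp \mathrm{dom}(\Gamma, S) & (\mathit{Letrec}) \\[1ex]
\langle \Gamma,\ \mathbf{ref}\ M,\ S \rangle &\to& \langle \Gamma\{x = M\},\ \underline{\mathbf{ref}}\,x,\ S \rangle \quad x \notin \mathrm{dom}(\Gamma, S) & (\mathit{Ref}) \\
\langle \Gamma,\ \mathbf{deref}\ M,\ S \rangle &\to& \langle \Gamma,\ M,\ \mathbf{deref} : S \rangle & (\mathit{Deref1}) \\
\langle \Gamma,\ \underline{\mathbf{ref}}\,x,\ \mathbf{deref} : S \rangle &\to& \langle \Gamma,\ x,\ S \rangle & (\mathit{Deref2}) \\
\langle \Gamma,\ M \mathrel{\texttt{<=>}} N,\ S \rangle &\to& \langle \Gamma,\ M,\ (\texttt{<=>}\ N) : S \rangle & (\mathit{RefEq}) \\
\langle \Gamma,\ \underline{\mathbf{ref}}\,x,\ (\texttt{<=>}\ N) : S \rangle &\to& \langle \Gamma,\ N,\ (\underline{\mathbf{ref}}\,x\ \texttt{<=>}) : S \rangle & (\mathit{Ref1}) \\
\langle \Gamma,\ \underline{\mathbf{ref}}\,y,\ (\underline{\mathbf{ref}}\,x\ \texttt{<=>}) : S \rangle &\to& \langle \Gamma,\ b,\ S \rangle \quad b = \begin{cases} \mathit{true} & \text{if } x = y \\ \mathit{false} & \text{otherwise} \end{cases} & (\mathit{Ref2})
\end{array}
\]

Fig. 1. Abstract machine semantics 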

Definition 4.1 (Convergence) A closed configuration $\langle \Gamma, M, S \rangle$ 
converges, written $\langle \Gamma, M, S \rangle\Downarrow$, if there exists a 
heap $\Delta$ and a value V such that 

$\langle \Gamma, M, S \rangle \to^{*} \langle \Delta, V, \epsilon \rangle$. 

We will also write $M\Downarrow$, identifying closed M with the initial 
configuration $\langle \emptyset, M, \epsilon \rangle$. Closed configurations 
which do not converge are of four types: they either (i) reduce indefinitely, or 
get stuck because of (ii) a type error, (iii) a case expression with an 
incomplete set of alternatives, or (iv) a black-hole (a self-dependent 
expression as in let x = x in x). All non-converging closed configurations will 
be semantically identified. 

Let C, D range over contexts - terms containing zero or more occurrences of a 
hole, $[\cdot]$, in the place where an arbitrary subterm might occur. Let C[M] 
denote the result of filling all the holes in C with the term M, possibly 
causing free variables in M to become bound. 

Definition 4.2 (Operational Approximation) We say that M operationally 
approximates N, written $M \sqsubseteq N$, if for all C such that C[M] and C[N] 
are closed, $C[M]\Downarrow$ implies $C[N]\Downarrow$. 

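We say that M and N are operationally equivalent, written $M \cong N$, when 
$M \sqsubseteq N$ and $N \sqsubseteq M$. Note that equivalence is a non-trivial 
equivalence relation. Below we present a sample of basic laws of equivalence. In 
the statement of all laws, we follow the standard convention that all bound 
variables in the statement of a law are distinct, and that they are disjoint 
from the free variables. 

\[
\begin{array}{rcl}
(\lambda x.M)\, y &\cong& M[y/x] \\
\mathbf{let}\ \{x = V, \vec{y} = \vec{D}[x]\}\ \mathbf{in}\ C[x] &\cong& \mathbf{let}\ \{x = V, \vec{y} = \vec{D}[V]\}\ \mathbf{in}\ C[V] \\
\mathbf{let}\ \{x = z, \vec{y} = \vec{D}[x]\}\ \mathbf{in}\ C[x] &\cong& \mathbf{let}\ \{x = z, \vec{y} = \vec{D}[z]\}\ \mathbf{in}\ C[z] \\
\mathbf{let}\ \{x = z, \vec{y} = \vec{M}\}\ \mathbf{in}\ N &\cong& \mathbf{let}\ \{x = z, \vec{y} = \vec{M}[z/x]\}\ \mathbf{in}\ N[z/x] \\
\mathbf{let}\ \{\vec{x} = \vec{M}\}\ \mathbf{in}\ N &\cong& N, \quad \text{if } \vec{x} \perp FV(N) \\
C[\mathbf{let}\ \{\vec{y} = \vec{V}\}\ \mathbf{in}\ M] &\cong& \mathbf{let}\ \{\vec{y} = \vec{V}\}\ \mathbf{in}\ C[M] \\
M \mathrel{\texttt{<=>}} N &\cong& N \mathrel{\texttt{<=>}} M
\end{array}
\]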

Remark: The fact that the reference constructor $\underline{\mathbf{ref}}$ is 
abstract (not available directly in the language) is crucial to the 
variable-inlining properties. For example a (derivable) law like 
$\mathbf{let}\ \{x = z\}\ \mathbf{in}\ N \cong N[z/x]$ would fail if terms could 
contain $\underline{\mathbf{ref}}$. This failure could be disastrous in some 
implementations, because in effect a configuration-level analogy of this law is 
applied by some garbage collectors. 
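
As an illustration (not part of the original text), write $\underline{\mathbf{ref}}$ for the abstract constructor and suppose it were allowed in source terms. Taking $N = \underline{\mathbf{ref}}\,x \mathrel{\texttt{<=>}} \underline{\mathbf{ref}}\,z$, with z bound elsewhere in the heap, the two sides of the law are separated by rule (Ref2) of the abstract machine, which compares variable names:

```latex
\begin{align*}
\mathbf{let}\ \{x = z\}\ \mathbf{in}\
  \underline{\mathbf{ref}}\,x \mathrel{\texttt{<=>}} \underline{\mathbf{ref}}\,z
  &\;\to^{*}\; \mathit{false}
  && \text{(the variables } x \text{ and } z \text{ differ)} \\
(\underline{\mathbf{ref}}\,x \mathrel{\texttt{<=>}} \underline{\mathbf{ref}}\,z)[z/x]
  \;=\; \underline{\mathbf{ref}}\,z \mathrel{\texttt{<=>}} \underline{\mathbf{ref}}\,z
  &\;\to^{*}\; \mathit{true}
  && \text{(the same variable on both sides)}
\end{align*}
```

Because source programs can only create references through ref, which always allocates a fresh heap variable, such terms cannot be written and the inlining law is preserved.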

4.3 Proof Techniques for Equivalence 

We have presented a collection of laws for approximation and equivalence - 
but how are they established? The definition of operational equivalence suffers 
from the standard problem: to prove that two terms are related requires one to 
examine their behaviour in all contexts. For this reason, it is common to seek to 
prove a context lemma [Mil77] for an operational semantics: one tries to show 
that to prove M operationally approximates N, one only need compare their 
immediate behaviour. The following context lemma simplifies the proof of many 
laws: 

Lemma 1 (Context Lemma). For all terms M and N, $M \sqsubseteq N$ if and only 
if for all $\Gamma$, S and substitutions $\sigma$, 
$\langle \Gamma, M\sigma, S \rangle\Downarrow$ implies 
$\langle \Gamma, N\sigma, S \rangle\Downarrow$. 

It says that we need only consider configuration contexts of the form 
$\langle \Gamma, [\cdot], S \rangle$ where the hole $[\cdot]$ appears only once. 
The substitution $\sigma$ from variables to variables is necessary here, but 
since laws are typically closed under such substitutions, there is no noticeable 
proof burden. 

The proof of the context lemma follows the same lines as the corresponding 
proof for the improvement theory for call-by-need [MS99], and it involves 
uniform computation arguments which are similar to the proofs of related 
properties for call-by-value languages with state [MT91]. 

In the full paper we present some key technical properties and a proof that the 
compiler optimisation performed after so-called strictness analysis is still sound 
in the presence of observable sharing. 

4.4 Relation to Other Calculi 

Similar languages have been considered by Odersky [Ode94] (call-by-name 
semantics) and Pitts and Stark [PS93] (call-by-value semantics). A 
reduction-calculus approach to call-by-need was introduced in [AFM+95], and 
extended to deal with mutable state in recent work of Ariola and Sabry [AS98]. 
The reduction-calculi approach in general has been pioneered by Felleisen et al. 
(e.g. [FH92]), and its advantage is that it builds on the idea of a core calculus 
of equivalences (generated by a confluent rewriting relation on terms); each 
language extension is presented as a conservative extension of the core theory. 
The price paid for this modularity is that the theory of equality is rather 
limited. The approach we have taken - studying operational equivalence - is 
exemplified by Mason and Talcott’s work on call-by-value lambda calculi and state 
[MT91]. An advantage of the operational-equivalence approach is that it is a much 
richer theory, in which induction principles may be derived that are 
inexpressible in reduction calculi. Our starting point has been the call-by-need 
improvement theory introduced by Moran and Sands [MS99]. In improvement theory, 
the definition of operational equivalence includes an observation of the number 
of reduction steps to convergence. This makes sharing observable - although 
slightly more indirectly. 

We have only scratched the surface of the existing theory. Induction principles 
would be useful - and also seem straightforward to adapt from [MS99]. For 
techniques more specific to the subtleties of references, work on parametricity 
properties of local names, e.g. [Pit96], is likely to be relevant. 

5 Conclusions 

We have motivated a small extension to Haskell which provides a practical so- 
lution to a common problem when manipulating data structures representing 
circuits. We have presented a precise operational semantics for this extension, 
and investigated laws of operational approximation. We have shown that the 
extended language has a rich equational theory, which means that the semantics 
is robust with respect to program transformations which respect sharing prop- 
erties. 

The extension we propose is small, and turns out to be easy to add to existing 
Haskell compilers/interpreters in the form of an abstract data type (a module 
with hidden data constructors). In fact similar functionality is already hidden 
away in the nonstandard libraries of many implementations (see 
www.haskell.org/implementations/). A simple implementation using the Hugs-GHC 
library extensions is given in the full version of the paper. 

The feature is likely to be useful for other embedded description languages, and 
we briefly consider two such applications in the full paper: writing parsers for 
left-recursive grammars, and an optimised representation of decision trees. 

Abstract. In this paper we prove that for timed algebras may testing 
is much stronger than might be expected. More exactly, we prove 
that the may testing semantics is equivalent to the must testing semantics 
for a rather typical discrete timed process algebra when considering 
divergence-free processes. This is so because for any adequate test we 
can define a dual one in such a way that a process passes the original 
test in the must sense if and only if it does not pass the dual one in the 
may sense. It is well known that in the untimed case by may testing we 
can (partially) know the possible behaviors of a process after the instant 
at which it diverges, which is not possible under must semantics. This is 
also true in the timed case. 

Keywords: process algebra, time, testing semantics, must, may. 



1 Introduction and Related Work 

Testing semantics was introduced in [DH84, Hen88] in order to have an abstract
semantics induced by the operational one, which allows us to compare processes
in a natural way. Besides, it is a formalization of the classical notion of
testing of programs. Tests are applied to processes, generating computations
that either succeed or fail. But as processes are non-deterministic, it is
possible that the same process sometimes passes a test and sometimes fails to
do so. This leads us to two different families of tests associated to each
process: those that are sometimes passed, and those that are always passed. Two
processes are equivalent if they pass the same tests, but as we have two
different ways to pass tests, we obtain two different testing semantics, called
may and must semantics. In this paper we will study testing semantics for timed
process algebras, which is also the subject of the Ph.D. Thesis of the first
author [Lla96]. As far as we know, there has not been much previous work on the
subject, but [HR95] is an interesting related reference.

In the untimed case the may semantics is just trace semantics, while the must 
semantics is more involved, and so we need acceptance trees [Hen88] in order to 
characterize it. But when time is introduced the may semantics also becomes 
more complex, since, as usual, we are assuming the ASAP rule, which allows us 
to detect by means of tests not only the actions that have been executed, but 
also those that could have been chosen instead. To be exact, we will prove in this
paper that for non-divergent processes the may testing semantics is equivalent
to the must testing semantics for a rather typical timed process algebra. This is 
so, because for any adequate test we can define a dual one in such a way that a 
process passes the original test in the must sense if and only if it does not pass 
the dual one in the may sense. 

The situation is much more complicated in the case of processes that are not
divergence free. It is well known that in the untimed case by may testing we can
(partially) know the possible behaviors of a process after the instant at which
it diverges, which is not possible under must semantics. This is also the case
in the timed case. In fact, we will present a couple of examples showing that in
the general case the may and must testing orderings are incomparable with each
other.

A related result can be found in [Sch95]; there an operational semantics for 
Timed CSP [RR88, Ree88] is presented, and it is proved that the timed fail- 
ures model is fully abstract with respect to the may testing semantics. Another 
interesting result can be found in [BDS95], where they prove that the failures
semantics for TE-LOTOS is the biggest congruence contained in the trace
semantics. In our case trace semantics is far from being equal to may semantics,
but it is interesting to observe that it is not a congruence for our language
either, in spite of the fact that we have no problems with internal actions in
the context of the choice operator, as happens in CCS or LOTOS.

In fact, in a previous (erroneous) version of this paper we tried to prove that
may testing equivalence was not a congruence for the parallel operator, but if
we consider the biggest congruence contained in it, for non-divergent processes
we just obtain the must equivalence. We are very grateful to Lars Jenner for
pointing out the mistake in that paper to us, and also for informing us that he
had obtained the same equivalence [Jen98] between may and must testing
semantics for the case of Petri Nets. It would be very interesting to study the
relation between both results, mainly because the ways in which the semantics
are defined and the equivalence result is proven are not trivially connected.
More generally, it would be nice to study which properties a model of real time
processes has to fulfill in order for the corresponding may and must testing
semantics to be dual to each other.

2 Syntax and Operational Semantics 

In this section we describe the syntax of the language we will consider. In
order to focus on the main characteristics and problems of timed algebras, we
introduce a simple timed process algebra which nevertheless contains the main
operators that characterize such an algebra. More exactly, we are talking about
those we consider to be the main operators of a high level timed process
algebra. So, we do not consider tic actions measuring time in an explicit way.
Nevertheless, it is possible to translate our high level operators to a low
level language containing such operators, and in this way results similar to
those in this paper could be obtained for a language such as the one in [HR95].
In our language, time is introduced via the prefix operator; actions must be
executed at the indicated time, and we will consider a discrete time domain
$T$. We consider a finite set of actions $Act$, an internal event
$\tau \notin Act$, and the set of events $\mathcal{E} = Act \cup \{\tau\}$. We
denote by $\mathcal{P}$ the set of terms generated by the following grammar:

$$P ::= \mathrm{STOP} \mid \mathrm{DIV} \mid et\,;P \mid P_1 \mathbin{\Box} P_2 \mid P_1 \sqcap P_2 \mid P \parallel_A Q \mid P \setminus A \mid x \mid \mathrm{REC}\,x.P,$$

where $x$ is a process variable, $A \subseteq Act$, $e \in \mathcal{E}$, and
$t \in T$. A process is a closed term generated by the previous grammar. We
denote the set of processes by CP.

Table 1. Operational Semantics.

In order to define the operational semantics of the language we need an
auxiliary function Upd(P, t), which represents the passage of t units of time
on the process P. This function is defined in Table 1. Looking at the
operational semantics, we observe that the function Upd(P, t) is only used when
$P \not\xrightarrow{\tau t'}$ for $t' < t$; so, the way $Upd(\tau t' ; P, t)$
is defined when $t' < t$ is not important, and it is only included for
completeness.

The operational semantics of CP is given by the relation
$\longrightarrow\ \subseteq CP \times (\mathcal{E} \times T) \times CP$ defined
by the rules in Table 1. The intuitive meaning of $P \xrightarrow{et} Q$ is
that P executes event e at time t to become Q. Time is always relative to the
previously executed action. So rule [PRE] indicates that the only event that
process $et ; P$ can execute is e, and the time when it is executed is t. The
negative premises in the rules are included to ensure the urgency of internal
actions. Note that non-guarded recursive processes execute infinitely many
internal actions





Relating May and Must Testing Semantics 




in a row, all of them at local time 0, just like process DIV. We say that such 
processes are divergent. 

Since some rules have negative premises, we have to provide a way to guarantee
that the generated transition system is consistent. This is achieved by
defining a stratification, as detailed in [Gro93]. We consider the following
function

$$f(P \xrightarrow{et} Q) = \text{number of operators in } P,$$

which is not difficult to check to be indeed a stratification.

The main characteristics of the operational semantics considered are urgency
and finite branching. In fact, the results of this paper can also be obtained
for any process algebra whose operational semantics fulfills these two
properties.

First, internal actions are urgent:
$P \xrightarrow{\tau t} P' \implies P \not\xrightarrow{e t'}$ for all
$e \in \mathcal{E}$ and $t' > t$.

Nevertheless, internal actions have no greater priority than observable actions
to be executed at the same time (although, as in the untimed case, the
execution of internal actions offered at the same time as them could preclude,
in a nondeterministic way, the execution of those observable actions). Finally,
observable actions to be executed before any internal action are not affected
by the existence of later internal actions.

Another important fact is that this operational semantics is finitely
branching. In an untimed process algebra this property could be stated as: for a
given process P there exist only finitely many transitions that P can execute.
This property is also satisfied by the timed process algebra that we consider,
but, as in the future we wish to extend the results in this paper to more
general timed process algebras, we have modified this property accordingly. For
instance, if we allowed intervals in the prefix operator, as in
$a[0..\infty] ; P$, then the above property would no longer be true. Thus we
consider instead the following one: given $e \in \mathcal{E}$ and $t \in T$, the
set $\{P' \mid P \xrightarrow{et} P'\}$ is finite.

To conclude this section we define the set of timed actions that a process P
can execute, $TA(P) = \{at \mid a \in Act,\ \exists P' : P \xrightarrow{at} P'\}$.

3 Testing Semantics 

In this section we define the testing semantics induced by the operational
semantics above. Tests are just finite processes^1 but defined by an extended
grammar where we add a new process, OK, which expresses that the test has been
passed. More exactly, tests are generated by the following B.N.F. expression:

$$T ::= \mathrm{STOP} \mid \mathrm{OK} \mid et\,;T \mid T_1 \mathbin{\Box} T_2 \mid T_1 \sqcap T_2 \mid T_1 \parallel_A T_2 \mid T \setminus A.$$

The operational semantics of tests is defined in the same way as for processes,
only adding the following rule for the test OK^2: [OK]
$\mathrm{OK} \xrightarrow{\mathrm{OK}} \mathrm{STOP}$. Finally, we define the
composition of a test and a process:
$P \mid T = (P \parallel_{Act} T) \setminus Act$.

^1 We could also allow recursive processes as tests, but they are not allowed,
since this would not add any additional power.

^2 To be exact, we should extend the definition of the operational semantics
for processes and tests to mixed terms defining their composition, since these
mixed terms are neither processes nor tests, but since this extension is
immediate we have preferred to omit this formal definition.

Definition 1. Given a computation of $P \mid T$

$$P \mid T = P_1 \mid T_1 \xrightarrow{\tau t_1} P_2 \mid T_2 \cdots P_k \mid T_k \xrightarrow{\tau t_k} P_{k+1} \mid T_{k+1} \cdots$$

we say that it is Complete if it is finite and blocked (no more steps are
allowed), or infinite. It is Successful if there exists some k such that
$T_k \xrightarrow{\mathrm{OK}}$.

Now we can give the basic definitions of may and must test passing.

Definition 2.

- We say that P must pass the test T (P must T) iff any complete computation
of $P \mid T$ is successful.

- We say that P may pass the test T (P may T) iff there exists a successful
computation of $P \mid T$.

- We write $P \sqsubseteq_{must} Q$ iff whenever P must T we also have Q must T.

- We write $P \sqsubseteq_{may} Q$ iff whenever P may T we also have Q may T.

- Finally, we write $P \simeq_{may} Q$ when $P \sqsubseteq_{may} Q$ and
$Q \sqsubseteq_{may} P$, and similarly for the must case.

4 States, b-Traces, and Barbs

In this section we will recall some definitions and results introduced in
[LdN96]. In order to characterize the testing semantics we will consider a
special kind of sets of timed actions which we call states. They represent any
of the possible local configurations of a process. In a state we have the set
of timed actions that are offered, and the time, if any, at which the process
becomes divergent. We consider that a process is divergent if it can execute in
a row an infinite number of internal actions, all of them at time 0. Note that
divergent processes only must pass trivial tests, that is, those initially
offering the action OK. In order to capture divergence, we have introduced a
new element $\Omega \notin Act$ that represents undefinition.

Definition 3. We say that $A \subseteq (Act \cup \{\Omega\}) \times T$ is a
state if

- There is at most a single $\Omega t \in A$, i.e.,
$\Omega t, \Omega t' \in A \implies t = t'$.

- If $\Omega t \in A$ then t is the maximum time in A, i.e.,
$\Omega t, at' \in A \implies t' < t$.

We will denote by ST the set of states.

Therefore, a state contains a set of timed actions. If a process P is in a
state A and $at \in A$, then P can execute action a at time t. When
$\Omega t \in A$, the process would become divergent at (local) time t if it
has not previously executed any of the actions in the state.
Example 1. Let
$P = (a0 ; a1 ; \mathrm{STOP}) \sqcap ((b1 ; a0 ; \mathrm{STOP}) \mathbin{\Box} (c0 ; \mathrm{STOP}))$.
As indicated by Definition 5, it has (initially) two states: $\{a0\}$ and
$\{b1, c0\}$. If we consider the process
$Q = P \mathbin{\Box} \tau 1 ; \mathrm{DIV}$, it has as states
$\{a0, \Omega 1\}$ and $\{c0, \Omega 1\}$. Note that action b at time 1 is not
included in the second state. This is so because states are used to
characterize the must semantics and we can easily check that
$(b1 ; \mathrm{STOP}) \mathbin{\Box} (\tau 1 ; \mathrm{DIV}) \simeq_{must} \tau 1 ; \mathrm{DIV}$.
Nevertheless, as we will see in Example 2, this action will appear in a b-trace
as an executed action of this process.

Next we give some auxiliary definitions over states that will be used later:

- We define the function $nd(\cdot) : ST \to T \cup \{\infty\}$, which gives us
the time at which a state becomes undefined (not defined function), by
$nd(A) = t$ if $\Omega t \in A$ and $nd(A) = \infty$ otherwise.

- For each time $t \in T$, we define the time addition and time restriction
operators over a set of states by $A + t = \{a(t + t') \mid at' \in A\}$ and
$A \upharpoonright t = \{at' \mid at' \in A \text{ and } t' < t\}$.

- If $A \in ST$, we define its set of timed actions by
$TA(A) = A \upharpoonright nd(A) = A \cap \{at \mid at \in A,\ a \neq \Omega\}$.

- If $A \in ST$ and $t \in T$, we write $A < t$ (resp. $A \le t$) iff for all
$at' \in A$ we have $t' < t$ (resp. $t' \le t$).
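The auxiliary operators on states are simple set manipulations, so a small
executable sketch may help to fix the notation. The encoding below is an
assumption made only for illustration: a timed action is a pair
(action, time), a state is a frozenset of such pairs, and the string "Ω" stands
for the undefinition element. The function names are hypothetical and do not
come from the paper.

```python
import math

OMEGA = "Ω"  # illustrative marker for the undefinition element


def is_state(A):
    """Check the two conditions of Definition 3 on a set of (action, time) pairs."""
    omega_times = {t for (a, t) in A if a == OMEGA}
    if len(omega_times) > 1:  # at most one Ωt may occur
        return False
    if omega_times:  # every other timed action lies strictly before Ωt
        (t,) = omega_times
        return all(t2 < t for (a, t2) in A if a != OMEGA)
    return True


def nd(A):
    """Time at which the state becomes undefined, or infinity if it never does."""
    return next((t for (a, t) in A if a == OMEGA), math.inf)


def shift(A, t):
    """Time addition A + t."""
    return frozenset((a, t + t2) for (a, t2) in A)


def restrict(A, t):
    """Time restriction: keep the timed actions strictly before t."""
    return frozenset((a, t2) for (a, t2) in A if t2 < t)


def timed_actions(A):
    """TA(A): the state restricted to its divergence time."""
    return restrict(A, nd(A))


# One of the states of process Q in Example 1: {a0, Ω1}.
q_state = frozenset({("a", 0), (OMEGA, 1)})
print(is_state(q_state), nd(q_state), sorted(timed_actions(q_state)))
```

Under this encoding, a set such as {b1, Ω1} is rejected by `is_state`, which
matches the remark in Example 1 that action b at time 1 does not appear in a
state that becomes undefined at time 1.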

A barb is a generalization of an acceptance [Hen88]. Additional information
must be included in order to record the actions that the process offers before
any action has been executed. First we introduce the concept of b-trace, which
is a generalization of the notion of trace. A b-trace, bs, is a sequence
$A_1 a_1 t_1 A_2 a_2 t_2 \cdots A_n a_n t_n$ that represents the execution of
the sequence of timed actions $a_1 t_1\, a_2 t_2 \cdots a_n t_n$ in such a way
that after the execution of each prefix $a_1 t_1 \cdots a_{i-1} t_{i-1}$ the
timed actions in $A_i$ were offered (but not taken) before accepting
$a_i t_i$. Then a barb is a b-trace followed by a final state, which represents
the configuration reached by the process after executing the b-trace. The time
values that appear in barbs and b-traces are relative to the previously
executed action.

Definition 4.

- b-traces are finite sequences, $bs = A_1 a_1 t_1 \cdots A_n a_n t_n$, where
$n \ge 0$, $a_i \in Act$, $t_i \in T$, $A_i \subseteq Act \times T$, and if
$a't' \in A_i$ then $t' < t_i$. We take length(bs) = n; when n = 0 we have the
empty b-trace, denoted by $\epsilon$.

- A barb b is a sequence $b = bs \cdot A$ where bs is a b-trace and A is a
state. We will represent the barb $\epsilon \cdot A$ simply by A, and so we can
consider states as (initial) barbs.

The states of a process P are computed from its complete computations of
internal actions. For any computation we record the actions that were offered
before the execution of the next internal action. For divergent computations we
also record the time of divergence.

Definition 5. For a process P, the set $\mathcal{A}(P)$ is the set of states
$A \in ST$ that are generated from the complete (initial) computations of
internal actions of P as described below:

- Each infinite computation
$P = P_1 \xrightarrow{\tau t_1} P_2 \xrightarrow{\tau t_2} \cdots P_k \xrightarrow{\tau t_k} P_{k+1} \cdots$
generates the state $A \in \mathcal{A}(P)$ given by^3

$$A = \bigcup_{i \in \mathbb{N}} \big((TA(P_i) \upharpoonright t_i) + t^i\big) \cup \Big\{\Omega t \;\Big|\; t = \sum_{i \in \mathbb{N}} t_i < \infty\Big\}.$$

- Each finite blocked computation
$P = P_1 \xrightarrow{\tau t_1} P_2 \cdots P_{n-1} \xrightarrow{\tau t_{n-1}} P_n$
generates the state

$$(TA(P_n) + t^n) \cup \bigcup_{i=1}^{n-1} \big((TA(P_i) \upharpoonright t_i) + t^i\big) \in \mathcal{A}(P),$$

where, in both cases, we take $t^i = \sum_{j=1}^{i-1} t_j$.

In order to define the b-traces of a process we introduce the following
notation: let $t \in T$, let $bs = A_1 a_1 t_1 \cdot bs_1$ be a b-trace, and let
$A \subseteq Act \times T$ be a set of timed actions such that $A < t$; then we
take $(A, t) \sqcup bs = (A \cup (A_1 + t))\, a_1 (t_1 + t) \cdot bs_1$.
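The notation above merges a period of elapsed time, together with the actions
offered during it, into the first step of a b-trace. The following sketch
illustrates it under an assumed encoding: a b-trace is a tuple of
(offered_set, action, time) triples, and the names `shift` and `join` are
hypothetical.

```python
def shift(A, t):
    """Time addition A + t on a set of (action, time) pairs."""
    return frozenset((a, t + t2) for (a, t2) in A)


def join(A, t, bs):
    """Return (A, t) ⊔ bs for a non-empty b-trace bs.

    Only the first triple of bs changes: its offered set is shifted by t and
    extended with A, and its execution time is delayed by t.
    """
    if not bs:
        raise ValueError("the notation is only defined for non-empty b-traces")
    if any(t2 >= t for (_, t2) in A):
        raise ValueError("the notation requires A < t")
    (A1, a1, t1), rest = bs[0], bs[1:]
    return ((frozenset(A) | shift(A1, t), a1, t1 + t),) + tuple(rest)


# Delaying the b-trace ∅b0 ∅a0 by one time unit during which c0 was offered.
bs = ((frozenset(), "b", 0), (frozenset(), "a", 0))
print(join({("c", 0)}, 1, bs))
```

The result of the call is the encoding of the b-trace {c0}b1∅a0, which is the
one discussed in Example 2.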



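Definition 6. Let P, P' be processes and bs a b-trace. We define the relation
$P \stackrel{bs}{\Longrightarrow} P'$ as follows:

- $P \stackrel{\epsilon}{\Longrightarrow} P$.

- If $P \xrightarrow{\tau t} P_1$ and $P_1 \stackrel{bs'}{\Longrightarrow} P'$
with $bs' \neq \epsilon$, then
$P \stackrel{(TA(P) \upharpoonright t,\, t) \sqcup bs'}{\Longrightarrow} P'$.

- If $P \xrightarrow{at} P_1$ and $P_1 \stackrel{bs'}{\Longrightarrow} P'$,
then $P \stackrel{(TA(P) \upharpoonright t)\, at \cdot bs'}{\Longrightarrow} P'$.

We define the set of barbs of P by
$Barb(P) = \{bs \cdot A \mid P \stackrel{bs}{\Longrightarrow} Q \text{ and } A \in \mathcal{A}(Q)\}$.

States, barbs and b-traces are closely related. The states of a process are its
initial barbs, those whose b-trace is empty. If
$P \stackrel{bs \cdot A\, at}{\Longrightarrow} Q$ then there exists a state A'
such that $bs \cdot A' \in Barb(P)$ and $A' \upharpoonright t = A$. Finally, if
$bs \cdot A \in Barb(P)$ and $at \in A$ then there exists a process Q such that
$P \stackrel{bs \cdot (A \upharpoonright t)\, at}{\Longrightarrow} Q$.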

Example 2. Let us consider the processes P and Q introduced in Example 1. The
barbs of P are $\{a0\}$, $\{b1, c0\}$, $\emptyset a0 \{a1\}$,
$\emptyset a0 \emptyset a1 \emptyset$, $\emptyset c0 \emptyset$,
$\{c0\} b1 \{a0\}$, and $\{c0\} b1 \emptyset a0 \emptyset$. The barbs of Q are
those of P removing $\{b1, c0\}$ and adding $\{\Omega 1, c0\}$. The fact that
$\{c0\} b1 \emptyset a0 \emptyset \in Barb(P)$ indicates that P can execute the
b-trace $\{c0\} b1 \emptyset a0$ and that it reaches a state that offers
$\emptyset$, i.e., the process deadlocks. The execution of the b-trace
$\{c0\} b1 \emptyset a0$ indicates that the process can execute the action b at
time 1 and then, immediately, the action a (local time 0, but global time 1);
but before executing action b the process could also execute the action c at
time 0. If we know that this b-trace has been executed, we can conclude that c
was not offered at time 0 by the environment. Note that
$\{c0\} b1 \{a0\} \in Barb(Q)$, in spite of the fact that
$\{c0, b1\} \notin Barb(Q)$.

We will use barbs and b-traces to characterize the testing semantics. This will 
be done by defining a preorder between sets of barbs and another one between 
sets of b-traces. In order to define them, we need the following ordering relations 
between b-traces and barbs: 

^3 Note that $\sum_{i \in \mathbb{N}} t_i < \infty$ iff there exists some $n_0$
such that $t_i = 0$ for all $i > n_0$.

Definition 7.

- We define the relation $\ll$^4 between b-traces as the least relation that
satisfies: 1. $\epsilon \ll \epsilon$; 2. If $bs' \ll bs$ and $A' \subseteq A$
then $A' at \cdot bs' \ll A at \cdot bs$.

- We define the relation $\ll$ between barbs as the least relation that
satisfies:

1. If bs, bs' are b-traces such that $bs' \ll bs$, and A, A' are states such
that $nd(A') \le nd(A)$ and $TA(A') \subseteq A$, then
$bs' \cdot A' \ll bs \cdot A$.

2. If A' is a state, $b = A_1 a_1 t_1 \cdot b'$ is a barb such that
$nd(A') \le t_1$ and $TA(A') \subseteq A_1$, and $bs' \ll bs$, then
$bs' \cdot A' \ll bs \cdot (A_1 a_1 t_1 \cdot b')$.

Intuitively, a b-trace bs is worse than another one bs' if the actions that
appear in both b-traces are the same, and the intermediate sets $A_i$ that
appear in bs are smaller than those $A'_i$ appearing in bs'. For barbs, we must
notice that whenever a process is in an undefined state, which means
$t = nd(A) < \infty$, it must pass no test after that time. Barbs and b-traces
are introduced to characterize the testing semantics. As shown in [LdN96], to
characterize the must testing semantics it is enough to extend the preorder
above to sets of barbs and b-traces, as follows:

Definition 8. Let $B_1$ and $B_2$ be sets of barbs; we say that
$B_1 \ll B_2$ iff for any $b_2 \in B_2$ there exists $b_1 \in B_1$ such that
$b_1 \ll b_2$.

The preorder $\ll$ can be used to characterize the one induced by the must
testing semantics, which means that we can prove the following Must
Characterization Theorem (proved in detail in [LdN96]):
$P \sqsubseteq_{must} Q \iff Barb(P) \ll Barb(Q)$.
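The orderings of Definitions 7 and 8 can be checked mechanically on finite sets
of barbs. The sketch below is only an illustration under stated assumptions: a
b-trace is a tuple of (offered_set, action, time) triples, a barb is a pair
(b-trace, state), states are frozensets of (action, time) pairs with "Ω" as the
undefinition marker, and the comparisons between divergence times are read as
non-strict. All function names are hypothetical.

```python
import math

OMEGA = "Ω"  # illustrative marker for the undefinition element


def nd(A):
    return next((t for (a, t) in A if a == OMEGA), math.inf)


def timed_actions(A):
    return frozenset((a, t) for (a, t) in A if a != OMEGA)


def trace_leq(small, big):
    """b-trace ordering: same actions and times, pointwise smaller offered sets."""
    return len(small) == len(big) and all(
        A1 <= A2 and a1 == a2 and t1 == t2
        for (A1, a1, t1), (A2, a2, t2) in zip(small, big)
    )


def barb_leq(small, big):
    """Barb ordering, covering both rules of Definition 7."""
    (bs1, A1), (bs2, A2) = small, big
    k = len(bs1)
    if k == len(bs2):  # rule 1: compare the two final states
        return (trace_leq(bs1, bs2) and nd(A1) <= nd(A2)
                and timed_actions(A1) <= A2)
    if k < len(bs2):  # rule 2: compare the final state with the next step of big
        offered, _, t = bs2[k]
        return (trace_leq(bs1, bs2[:k]) and nd(A1) <= t
                and timed_actions(A1) <= offered)
    return False


def set_leq(B1, B2):
    """Definition 8: every barb of B2 dominates some barb of B1."""
    return all(any(barb_leq(b1, b2) for b1 in B1) for b2 in B2)


# Initial states only (empty b-trace) of P and Q from Example 1.
P_states = [((), frozenset({("a", 0)})), ((), frozenset({("b", 1), ("c", 0)}))]
Q_states = [((), frozenset({("a", 0), (OMEGA, 1)})),
            ((), frozenset({("c", 0), (OMEGA, 1)}))]
print(set_leq(Q_states, P_states), set_leq(P_states, Q_states))
```

The example compares only the initial states listed in Example 1, not the full
sets of barbs, so it illustrates the ordering rather than deciding the must
preorder between the two processes.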

In the following sections we will characterize the may testing semantics. 

5 Divergence Free Processes 

First, the relationship between may and must testing semantics will be studied
in the case when the involved processes are divergence free. We will show that,
rather surprisingly, in this case both relations are symmetric to each other:

$$P \sqsubseteq_{must} Q \iff Q \sqsubseteq_{may} P$$

Even more, we will show that may testing can be viewed as the dual relation of
must testing, in the sense that we can find a family of standard tests
characterizing must semantics such that, if we define the dual of a test in a
very simple and natural way (to be made precise later), a process passes a test
of the family in the must sense iff it does not pass its dual test in the may
sense.

Intuitively, a process P is divergent, and then we write $P \Uparrow$, if P can
execute in a row infinitely many internal actions, all of them at (local) time
0. We say that P is divergence free when no execution of P leads us to a
divergent process P'. For instance, the process $P = a1 ; \mathrm{STOP}$ is
divergence free, while $Q = a1 ; \mathrm{DIV}$ is not, because of the existence
of the computation $Q \xrightarrow{a1} \mathrm{DIV}$. We could formally define
divergence freedom directly from the operational semantics, but since this
definition is a little cumbersome, we present instead the following equivalent
characterization in terms of barbs:

^4 Note that the symbol $\ll$ is overloaded, since it is used for both b-traces
and barbs.

Definition 9. We say that a process P is divergence free iff for each barb
$bs \cdot A \in Barb(P)$ we have $nd(A) = \infty$.

Next we will prove the left to right arrow:
$P \sqsubseteq_{must} Q \implies Q \sqsubseteq_{may} P$. We will use the
following characterization of the must testing ordering, which we have proved
in [LdN96]: $P \sqsubseteq_{must} Q \iff Barb(P) \ll Barb(Q)$. So it is enough
to prove:

Proposition 1. $Barb(Q) \ll Barb(P) \implies P \sqsubseteq_{may} Q$.

Proof (sketch). Let T be a test such that P may T; then there exist some
process P' and some test T' such that $T' \xrightarrow{\mathrm{OK}}$ and there
exists a computation from $P \mid T$ to $P' \mid T'$. So, there exist P'' and
T'' and a couple of b-traces $bs = A_1 a_1 t_1 \cdots A_n a_n t_n$ and
$bs' = A'_1 a_1 t_1 \cdots A'_n a_n t_n$ such that
$P \stackrel{bs}{\Longrightarrow} P''$,
$T \stackrel{bs'}{\Longrightarrow} T''$, $A_i \cap A'_i = \emptyset$, and there
exists a computation from $P \mid T$ to $P'' \mid T''$ and another one from
$P'' \mid T''$ to $P' \mid T'$.

Now from the computations of P'' and T'' to P' and T' we get two states
$A \in \mathcal{A}(P'')$ and $A' \in \mathcal{A}(T'')$ satisfying
$A \cap A' = \emptyset$.

Since $Barb(Q) \ll Barb(P)$ and P and Q are divergence free, there exist some
barb $b'' = bs'' \cdot A'' \in Barb(Q)$ with
$bs'' = A''_1 a_1 t_1 \cdots A''_n a_n t_n$ and some processes Q' and Q'' such
that $A''_i \subseteq A_i$, $A'' \subseteq A$,
$Q \stackrel{bs''}{\Longrightarrow} Q''$, and there is a computation from Q''
to Q' that generates the state A'' by applying Definition 5. So we have
$A''_i \cap A'_i = \emptyset$ and $A'' \cap A' = \emptyset$, and therefore we
have a computation from $Q \mid T$ to $Q'' \mid T''$. Since Q is divergence
free, we also obtain a computation from $Q'' \mid T''$ to $Q' \mid T'$.

Next we prove the right to left side of the equivalence. For it, we introduce a
new family of standard tests characterizing must semantics, which means that
whenever we have $P \not\sqsubseteq_{must} Q$ we can find some test T in the
family such that P must T but Q does not must pass T. These tests are similar,
but not identical, to those with the same name and property that we have used
in [LdN96]. Although we could also use those tests here in order to relate
$\sqsubseteq_{may}$ and $\sqsubseteq_{must}$, we have preferred to use the new
family instead, because by considering the corresponding dual tests we can
directly obtain that relationship, thus emphasizing the duality between must
and may passing of tests. The tests that constitute the new family of standard
tests are those obtained by applying the following definition:
Definition 10. Given a barb b and a set of barbs B such that there is no barb
$b' \in B$ with $b' \ll b$, we say that a test T is well formed with respect to
B and b when it can be derived by applying the following rules:

- If $b = A$, we take any finite set $A_1 \subseteq Act \times T$ such that for
any $A' \in B$ we have $A_1 \cap A' \neq \emptyset$. Then, taking

$$T_2 = \Big(\mathop{\Box}_{at' \in A_1} at' ; \mathrm{OK}\Big) \mathbin{\Box} \tau t ; \mathrm{STOP}, \quad \text{where } t > \max\{t' \mid at' \in A_1\},$$

we have that $T = T_1 \mathbin{\Box} T_2$ is well formed with respect to B and
b.

- If $b = A\, at \cdot b_1$, we consider any finite set
$A_1 \subseteq Act \times T$ with $A_1 \cap A = \emptyset$ and such that any
barb $A' at \cdot b'_1 \in B$ either satisfies $A' \cap A_1 \neq \emptyset$ or
$b'_1 \not\ll b_1$. When $A_1 \neq \emptyset$, we consider the test

$$T_2 = \mathop{\Box}_{a't' \in A_1} a't' ; \mathrm{OK}.$$

Besides, taking the set of barbs
$B_1 = \{b' \mid A' \subseteq A \text{ and } A' at \cdot b' \in B\}$, when
$B_1 \neq \emptyset$, we can take as $T_1$ any well formed test with respect to
$B_1$ and $b_1$. Then we have that

( TiDT2nit; OK if ^ 0, Bi ^ 0 

T = <Ti if Ml ^ 0, = 0 

[ T2 nit ; OK if Mi = 0, Bi ^ 0 

is a well formed test with respect to B and b.

Given an arbitrary set of barbs B and a barb b, it is possible that there is no
well formed test T with respect to B and b, because the finite set $A_1$
required in the first part of the definition might not exist. But, as the
operational semantics is finitely branching, for any $B = Barb(P)$ and
$b \in Barb(Q)$ such that there is no $b' \in B$ with $b' \ll b$, there is some
well formed test T with respect to B and b. Finally, dual tests are defined as
expected:

Definition 11. Let T be a test; we define its dual test, $T^*$, by
interchanging the trailing occurrences of STOP and OK in T.
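Computing a dual test is a plain structural recursion that swaps the leaves of
the test. The sketch below assumes a small illustrative syntax tree covering
only prefix and the two choice operators (parallel composition and hiding are
omitted for brevity); the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stop:
    pass


@dataclass(frozen=True)
class Ok:
    pass


@dataclass(frozen=True)
class Prefix:
    event: str    # an action name, or "τ" for the internal event
    time: int
    cont: object  # the test that follows the timed event


@dataclass(frozen=True)
class Choice:
    kind: str     # "ext" for external choice, "int" for internal choice
    left: object
    right: object


def dual(test):
    """Swap the trailing STOP and OK leaves, keeping the rest of the test."""
    if isinstance(test, Stop):
        return Ok()
    if isinstance(test, Ok):
        return Stop()
    if isinstance(test, Prefix):
        return Prefix(test.event, test.time, dual(test.cont))
    if isinstance(test, Choice):
        return Choice(test.kind, dual(test.left), dual(test.right))
    raise TypeError(f"unsupported test node: {test!r}")


# (a0 ; OK) □ (τ1 ; STOP) and its dual (a0 ; STOP) □ (τ1 ; OK).
T = Choice("ext", Prefix("a", 0, Ok()), Prefix("τ", 1, Stop()))
print(dual(T))
```

Applying `dual` twice returns the original test, which is the symmetry one
would expect from the definition.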

The well formed tests and their duals satisfy the following:

Proposition 2. Let B be a set of barbs and b a barb such that there is no
$b' \in B$ with $b' \ll b$. Let T be a well formed test with respect to B and
b; then:

$b \in Barb(Q) \implies$ Q does not must pass T, and Q may $T^*$.

$B = Barb(P) \implies$ P must T, and P does not may pass $T^*$.

Proposition 3. $P \sqsubseteq_{may} Q \implies Q \sqsubseteq_{must} P$.

Proof. Let us suppose that $Q \not\sqsubseteq_{must} P$; then there must exist
some $b \in Barb(P)$ such that there is no $b' \in Barb(Q)$ verifying
$b' \ll b$. Then we take a well formed test T with respect to Barb(Q) and b,
such that we have Q must T while P does not must pass T. Then, by applying the
previous proposition, we have P may $T^*$ while Q does not may pass $T^*$. This
contradicts our hypothesis, $P \sqsubseteq_{may} Q$.

Finally, as a consequence of Propositions 1 and 3, we obtain the may testing
characterization theorem for divergence-free processes that we were looking for:

Theorem 1. $P \sqsubseteq_{may} Q \iff Q \sqsubseteq_{must} P$.

6 Processes with Divergences 

Once we have studied the relation $\sqsubseteq_{may}$ for non divergent
processes we proceed to study it in the general case. Since in the untimed case
we already had $\sqsubseteq_{must} \not\subseteq \sqsupseteq_{may}$ when
divergences appear, we could expect that in the timed case we would have the
same result. This is indeed the case, as the following example shows:

Example 3. Let us consider the processes $P = \mathrm{DIV}$ and
$Q = \mathrm{DIV} \mathbin{\Box} a1 ; \mathrm{STOP}$. It is easy to show that P
and Q are equivalent under must testing, i.e., $P \sqsubseteq_{must} Q$ and
$Q \sqsubseteq_{must} P$; but, on the other hand, we have
$Q \not\sqsubseteq_{may} P$.
We also have $\sqsubseteq_{must} \not\supseteq \sqsupseteq_{may}$, which means
that we really need divergence freedom in order to prove Proposition 3. To show
it, let us consider

$P = ((a0 ; \mathrm{STOP}) \mathbin{\Box} (\tau 2 ; \mathrm{DIV})) \sqcap ((a0 ; \mathrm{STOP}) \mathbin{\Box} (b0 ; \mathrm{STOP}) \mathbin{\Box} (\tau 1 ; \mathrm{DIV}))$ and

$Q = ((a0 ; \mathrm{STOP}) \mathbin{\Box} (\tau 2 ; \mathrm{DIV})) \sqcap ((a0 ; \mathrm{STOP}) \mathbin{\Box} (b0 ; \mathrm{STOP}) \mathbin{\Box} (\tau 2 ; \mathrm{DIV}))$.

It is not difficult to show, by using Theorem 2, that under may testing
semantics both processes are equivalent, i.e., $P \simeq_{may} Q$; but we have
$Q \not\sqsubseteq_{must} P$, and so $P \not\simeq_{must} Q$.

In order to characterize $\sqsubseteq_{may}$ in the general case, we introduce
the following

Definition 12. Let $bs \cdot A$ and $bs' \cdot A'$ be barbs; we say
$bs' \cdot A' \ll_{may} bs \cdot A$ iff $bs' \ll bs$, $nd(A') \ge nd(A)$, and
$A' \upharpoonright nd(A) \subseteq A$. Then we define the relation
$\ll_{may}$ between sets of barbs by saying $B_1 \ll_{may} B_2$ iff for any
$b_1 \in B_1$ there exists $b_2 \in B_2$ such that $b_2 \ll_{may} b_1$.
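Definition 12 can be tried out on the two processes of the preceding example.
The sketch below assumes the same illustrative encoding as before (barbs as
pairs of a b-trace and a frozenset state, "Ω" as the undefinition marker,
non-strict comparison of divergence times) and uses hypothetical function
names. It takes as given that the initial states of P are {a0, Ω2} and
{a0, b0, Ω1} and those of Q are {a0, Ω2} and {a0, b0, Ω2}; these were worked
out by hand from Definition 5 and are not stated in the text.

```python
import math

OMEGA = "Ω"  # illustrative marker for the undefinition element


def nd(A):
    return next((t for (a, t) in A if a == OMEGA), math.inf)


def restrict(A, t):
    return frozenset((a, t2) for (a, t2) in A if t2 < t)


def trace_leq(small, big):
    return len(small) == len(big) and all(
        A1 <= A2 and a1 == a2 and t1 == t2
        for (A1, a1, t1), (A2, a2, t2) in zip(small, big)
    )


def barb_leq_may(small, big):
    """The may ordering between two barbs, as in Definition 12."""
    (bs1, A1), (bs2, A2) = small, big
    return (trace_leq(bs1, bs2) and nd(A1) >= nd(A2)
            and restrict(A1, nd(A2)) <= A2)


def set_leq_may(B1, B2):
    """Lifting to sets: every b1 in B1 has some b2 in B2 below it."""
    return all(any(barb_leq_may(b2, b1) for b2 in B2) for b1 in B1)


# Assumed initial states of P and Q (empty b-trace only).
P_states = [((), frozenset({("a", 0), (OMEGA, 2)})),
            ((), frozenset({("a", 0), ("b", 0), (OMEGA, 1)}))]
Q_states = [((), frozenset({("a", 0), (OMEGA, 2)})),
            ((), frozenset({("a", 0), ("b", 0), (OMEGA, 2)}))]
print(set_leq_may(P_states, Q_states), set_leq_may(Q_states, P_states))
```

On these initial states the ordering holds in both directions, in line with the
claim that P and Q are may equivalent; a complete check would have to range
over all barbs of both processes.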

The rest of the section is devoted to proving the May Characterization Theorem
for the general case:
$P \sqsubseteq_{may} Q \iff Barb(P) \ll_{may} Barb(Q)$. It is an immediate
consequence of Propositions 4 and 5 below. In the following, tbs(bs) stands for
the duration of bs, defined by taking $tbs(\epsilon) = 0$ and
$tbs(A_1 a_1 t_1 \cdot bs_1) = t_1 + tbs(bs_1)$. To prove the left to right
arrow of the theorem, we need a special kind of tests:

Definition 13. Let b be a barb; we inductively define the test T(b) as follows:
T(€- A) = rt;0Ka □ aV ; STOP (t = ndU)) 

a't'gA, t'<t V V // 

TiA'a't' ■b,t) = a't' ; T(b, a,t)o □ a”t” ; STOP 

t"<t 

For these tests it is easy to check the following: 

Lemma 1. P may T(b) iff there exists $b' \in Barb(P)$ such that
$b' \ll_{may} b$.

Proposition 4. $P \sqsubseteq_{may} Q \implies Barb(P) \ll_{may} Barb(Q)$.

Proof. Let us consider $b = bs \cdot A \in Barb(P)$. Then we have P may T(b).
As $P \sqsubseteq_{may} Q$ we also have Q may T(b), and by applying the
previous lemma we get the desired result.

Proposition 5. $Barb(P) \ll_{may} Barb(Q) \implies P \sqsubseteq_{may} Q$.

Proof (sketch). The proof of this proposition is very similar to that of
Proposition 1. The only difference is the way in which the adequate barb
$b'' = bs'' \cdot A'' \in Barb(Q)$ is found. To do it in this case we have to
use the fact that $Barb(P) \ll_{may} Barb(Q)$. From this fact we can also
conclude that this barb verifies $bs'' \ll bs$,
$A'' \upharpoonright nd(A) \subseteq A$ and $nd(A'') \ge nd(A)$. Then for the
corresponding processes Q' and Q'' we obtain a successful computation from
$Q \mid T$ to $Q' \mid T'$.

Finally we have the desired result:

Theorem 2. $P \sqsubseteq_{may} Q \iff Barb(P) \ll_{may} Barb(Q)$.
7 Conclusions and Discussion

In this paper we have defined and characterized may testing semantics for timed
process algebras. We have distinguished the cases when the involved processes
are divergence free or not. When considering divergence free processes we have
proved that the may testing semantics is the inverse preorder of must testing
semantics. Moreover, for a representative class of tests we have defined the
notion of dual test. This class of tests is powerful enough to distinguish
non-related processes, and for any test T in that class it holds that P must T
if and only if P does not may pass $T^*$, where $T^*$ is the dual test of T. So
we can conclude that, in some sense, the may testing semantics is dual to the
must testing semantics. The problems that appear when considering processes
that are not divergence free are similar to those appearing in the untimed
case.

It would be interesting to compare in detail our results with those in [Sch95].
The time model used there is continuous ($\mathbb{R}$ in fact), but this point
deserves no special attention. In fact, some people working in this area claim
that continuous time is more general than discrete time. If that were the case,
any result obtained for a reasonable continuous timed process algebra could be
transferred to the associated discrete timed algebra. Applying this argument we
could conclude from our results and those in [Sch95] that must testing
semantics is equivalent to failures semantics. In fact, we conjecture that this
is indeed the case, but this cannot be concluded just by applying this naive
argument.

Reasonable continuous and discrete timed models are not trivially comparable.
For instance, process $at ; P$ cannot be modeled in Schneider's framework. The
candidate to simulate such a process in that model would be
$WAIT\ t ; (a \to P \triangleright \mathrm{STOP})$. But in this way we do not
obtain an exact translation, since the meaning (using our syntax) of the latter
term is the same as that of $a0 ; P \sqcap \tau 0 ; \mathrm{STOP}$. The reason
why this happens is that, in Schneider's model or any other reasonable
continuous time model, when a process rejects an action at time 0, it must also
do it at some instant $\varepsilon > 0$. This naturally reflects the continuous
character of real numbers. As a consequence, we cannot assert any property
related to a concrete instant of time. Therefore, much more care is needed when
comparing continuous and discrete timed models.
Abstract. We present a framework for formal reasoning about the behaviour of
distributed programs implementing open distributed systems (ODSs). The
framework is based on the following key ingredients: a specification language
based on the μ-calculus, a hierarchical transitional semantics of the
implementation language used, a judgment format allowing parametrised
behavioural assertions, and a proof system for proving validity of such
assertions which includes proof rules for property decomposition. This setting
provides the expressive power for behavioural reasoning required by the complex
open and dynamic nature of ODSs.

The utility of the approach is illustrated on a prototypical ODS. 

1 Introduction 

For a few years now, the Formal Design Techniques group at the Swedish Insti- 
tute of Computer Science has pursued a programme aimed at enabling formal 
verification of complex open distributed systems (ODSs) through program code 
verification. While previous work by the group has been predominantly directed 
towards establishing the mathematical machinery [5], basic tool support [3], and 
performing case studies [2], the present paper focuses on methodological aspects 
by motivating the chosen verification framework and by showing on an example 
proof how suitable this framework is in practice for formal reasoning about the 
behaviour of ODSs. 

A central feature of open distributed systems as opposed to concurrent sys- 
tems in general is their reliance on modularity. Large-scale open distributed 
systems, for instance in telecom applications, must accommodate complex func- 
tionality such as dynamic addition of new components, modification of inter- 
connection structure, and replacement of existing components without affecting 
overall system behaviour adversely. To this effect it is important that component 
interfaces are clearly defined, and that systems can be put together relying only 
on component behaviour along these interfaces. That is, behaviour specification, 
and hence verification, needs to be parametric on subcomponents. But almost
all prevailing approaches to verification of concurrent and distributed systems 
rely on the assumption that process networks are static, or can safely be approximated
as such, as this assumption opens up the possibility of bounding the
space of global system states. Clearly such assumptions square poorly with the 
dynamic and parametric nature of open distributed systems. 

The decision to focus on verification of actual program code rather than 
addressing the easier task of verifying specifications comes from the observation 
that still, after all these years of advocating formalised specifications as a means 
to improve the quality of products, in industry today only rarely does one find 
such formalised specifications. 

We summarise the framework in Section 2 as it has developed throughout the 
project, and then illustrate its merits in Section 3 by focusing on a prototypical 
distributed systems example where a set data structure is implemented through 
the coordination of a dynamically changing number of processes. The example is 
programmed in the Erlang language [1], a functional programming language with 
support for distribution and concurrency, that is nowadays used in numerous 
telecommunication products developed by the Ericsson corporation. To illustrate 
the verification method we formulate and sketch a proof of a key property of the 
set implementation. 



2 Verification of Open Distributed Systems 

First we examine the characteristics of programming platforms for open distributed
systems, and from this description derive requirements on the formal
machinery necessary to permit verification of open distributed systems. 



2.1 Programming Platforms for ODSs 

Programming platforms provide the necessary functionality for open distributed 
systems. To name but a few services typically provided: 

1. The basic building blocks that can execute concurrently (processes and/or 
threads, concurrent objects). 

2. Facilities for dynamically creating new executing entities.

3. Means for coordination of, and communication between, concurrently executing
entities, for example through semaphores, a shared memory, remote
method calls, or asynchronous message passing.

4. Support for implicitly or explicitly grouping executing entities into more 
complex structures such as process groups, rings of processes or hypercubes. 

5. Support for fault detection and fault recovery. 

Like large software systems in general, ODSs are usually built from libraries 
of components. These ideally use encapsulation to provide clean interfaces to the 
components to prevent their improper use. 
2.2 A Framework for Formal Reasoning about ODS Behaviour

Semantics of ODSs. To reason in a formal fashion about the behaviour of 
an ODS, a formal semantics of the design language in which the system is de- 
scribed is needed. This can be done in different styles, depending on the intended 
style of reasoning. Our methodology is mainly tailored to operational semantics,
although other formal notions of behaviour are derivable in our framework,
supporting reasoning in different flavours. Operational semantics are usually 
presented by transition rules involving labelled transitions between structured 
states [11]. A natural approach to handling the different conceptual layers of 
entities in the language, supporting modular (i.e. compositional) reasoning, is to 
organise the semantics hierarchically, in layers, using different sets of transition 
labels at each layer, and extending at each layer the structure of the state with 
new components as needed. This approach will be illustrated in the example of 
the Erlang programming language in Section 2.3. 



Specification Language. Reasoning about complex systems requires compositional
reasoning, i.e. the capability to reduce arguments about the behaviour of
compound entities to arguments about the behaviours of their parts. To support
compositional reasoning, a specification language should capture the labelled
transitions at each layer of the transitional semantics. Poly-modal logic is
particularly suitable for the task, employing box and diamond modalities labelled
by the transition labels: a structured state $s$ satisfies formula
$\langle\alpha\rangle\Phi$ if there is an $\alpha$-derivative of $s$ (i.e. a state
$s'$ such that $s \xrightarrow{\alpha} s'$ is a valid labelled transition)
satisfying $\Phi$, while $s$ satisfies $[\alpha]\Phi$ if all $\alpha$-derivatives
of $s$ (if any) satisfy $\Phi$.
Additionally, state predicates are needed to capture the “local”, unobservable
characteristics of structured states, such as the value of a local variable. The
presence of recursion on different layers also requires the specification language
to be recursive. Adding recursion in the form of least and greatest fixed points
to the modalities described above results in a powerful specification language,
broadly known as the μ-calculus [10, 8]. Roughly speaking, least fixed-point
formulas $\mu X.\phi$ express eventuality properties, while greatest fixed-point formulas
$\nu X.\phi$ express invariant properties. Nesting of fixed points allows complicated
reactivity and fairness properties to be expressed.
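
To make the modalities and fixed points concrete, the following is a minimal sketch that evaluates them over a small finite labelled transition system by plain iteration. The example system, the state names, and the helper names (diamond, box, lfp, gfp) are assumptions made for illustration; the framework in this paper targets infinite-state systems and works by proof rather than by state enumeration.

```python
from typing import Callable, Set, Tuple

State = str
Label = str
Transition = Tuple[State, Label, State]


def diamond(trans: Set[Transition], label: Label, phi: Set[State]) -> Set[State]:
    """<label>phi: states with at least one label-derivative inside phi."""
    return {s for (s, a, t) in trans if a == label and t in phi}


def box(states: Set[State], trans: Set[Transition], label: Label,
        phi: Set[State]) -> Set[State]:
    """[label]phi: states all of whose label-derivatives (if any) lie in phi."""
    violating = {s for (s, a, t) in trans if a == label and t not in phi}
    return states - violating


def lfp(step: Callable[[Set[State]], Set[State]]) -> Set[State]:
    """Least fixed point of a monotone step, iterated upwards from the empty set."""
    current: Set[State] = set()
    while True:
        following = step(current)
        if following == current:
            return current
        current = following


def gfp(states: Set[State], step: Callable[[Set[State]], Set[State]]) -> Set[State]:
    """Greatest fixed point of a monotone step, iterated downwards from all states."""
    current = set(states)
    while True:
        following = step(current)
        if following == current:
            return current
        current = following


# Hypothetical example system: s0 -a-> s1 -a-> s2 -b-> s2, and s3 -a-> s3.
STATES = {"s0", "s1", "s2", "s3"}
TRANS = {("s0", "a", "s1"), ("s1", "a", "s2"), ("s2", "b", "s2"), ("s3", "a", "s3")}
GOAL = {"s2"}
SAFE = {"s0", "s1", "s3"}

# mu X. goal \/ <a>X : a goal state can be reached by a-steps (eventuality).
can_reach_goal = lfp(lambda x: GOAL | diamond(TRANS, "a", x))
# nu X. safe /\ [a]X : safe holds along every a-path (invariant).
always_safe = gfp(STATES, lambda x: SAFE & box(STATES, TRANS, "a", x))
```

In this toy system s3 can loop on a forever without reaching the goal, so it is excluded from the least fixed point, while it is the only state from which the invariant holds.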



Parametricity. As explained above, reasoning about open systems requires
reasoning about their interface behaviour relativised by assumptions about certain
system parameters. Technically, this can be achieved by using Gentzen-style
proof systems, allowing free parameters to occur within the proof judgments of
the proof system. The judgments are of the form $\Gamma \vdash \Delta$, where
$\Gamma$ and $\Delta$ are sets of assertions. A judgment is deemed valid if, for
any interpretation of the free variables, some assertion in $\Delta$ is valid
whenever all assertions in $\Gamma$ are valid. Parameters are simply variables
ranging over specific types of entities, such as messages, functions, or
processes. For example, the proof judgment $x : \Psi \vdash P(x) : \Phi$
states that object $P$ has property $\Phi$ provided the parameter $x$ of $P$
satisfies property $\Psi$.
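
The validity condition for judgments can be illustrated by brute force over a finite domain. This is only a sketch under strong assumptions: assertions are modelled as Python predicates over an environment, and the free variables range over a small finite set, whereas the proof system of this paper establishes validity by derivation. The names valid, Env and Assertion are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, Iterable, List, Sequence

Env = Dict[str, int]
Assertion = Callable[[Env], bool]


def valid(gamma: List[Assertion], delta: List[Assertion],
          variables: Sequence[str], domain: Iterable[int]) -> bool:
    """Check Gamma |- Delta: for every interpretation of the free variables,
    some assertion in Delta holds whenever all assertions in Gamma hold."""
    for values in product(domain, repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(a(env) for a in gamma) and not any(a(env) for a in delta):
            return False  # counter-interpretation found
    return True


# Hypothetical parametric judgment: x : even |- x + 2 : even.
x_even: Assertion = lambda env: env["x"] % 2 == 0
x_plus_two_even: Assertion = lambda env: (env["x"] + 2) % 2 == 0
```

Dropping the assumption on the parameter makes the judgment invalid, which is the point of relativising a property to assumptions about parameters.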



Compositionality. Reducing an argument about the behaviour of compound
entities to arguments about the behaviours of their parts can be achieved through
parametricity: we can relativise an assertion $P[Q/x] : \Phi$ about the compound
object $P$ to a certain property $\Psi$ of its component $Q$ by considering $Q$ as a
parameter for which property $\Psi$ is assumed, provided we can show that $Q$ indeed
satisfies the assumed property $\Psi$. Technically this can be achieved through a
term-cut proof rule of the shape:

\[
\frac{\Gamma \vdash Q : \Psi, \Delta \qquad \Gamma, x : \Psi \vdash P : \Phi, \Delta}
     {\Gamma \vdash P[Q/x] : \Phi, \Delta}
\]



Recursion. When reasoning about programs in the presence of recursion on 
different layers, one traditionally relies on different forms of inductive reasoning,
such as mathematical induction, complete induction and well-founded induction.
Of these, the last is the most general one. Through a sophisticated
mechanism for generalized-loop detection, fixed-point approximation, and dis- 
charge, our proof method supports well-founded inductive reasoning, as well as 
proofs by co-induction [9] which is needed when reasoning about entities of non- 
well-founded nature such as infinite streams. Recursion on any layer like data, 
functions, and processes is treated uniformly in this framework. 

The mechanism itself is presented and studied in detail in [5]; here we only 
give an idea of the approach. Assume that we are to prove that repeated pop- 
ping of elements from a stack must eventually fail unless interleaved with the 
pushing of new elements. The initial proof goal will roughly have the shape 
$s : stack \vdash reppop(s) : terminates$, where stack is the type of stack expressed as
a formula describing the transitional semantics of stacks, reppop is a function 
implementing repeated popping, and terminates is a formula expressing termi- 
nation of computation. Since the stack definition and the formulas are recursive, 
in the process of proof construction we will eventually reach a point where we 
have to prove the same termination property, but for a modified stack. In fact, 
this new proof goal will be an instance of the more general initial goal, and in 
this way we will have discovered a generalised loop in the proof tree. But along 
this loop we will have made progress in that we will have decreased the value of 
an ordinal approximating a least fixed-point formula describing the stack. This 
fact will allow the new goal to be discharged with respect to the initial goal, 
analogously to the way assumptions are discharged in natural deduction, thus 
terminating successfully the respective branch in the proof tree. 
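
The progress argument along the loop can be sketched in executable form. Here the length of a finite stack stands in for the ordinal that approximates the least fixed-point formula; that identification, the tuple representation of stacks, and the function names are assumptions made for this illustration and are not taken from [5].

```python
from typing import List, Tuple

Stack = Tuple[int, ...]


class EmptyStack(Exception):
    """Raised when popping from an empty stack."""


def pop(stack: Stack) -> Tuple[int, Stack]:
    """Return the top element and the remaining stack."""
    if not stack:
        raise EmptyStack
    return stack[-1], stack[:-1]


def reppop(stack: Stack) -> List[int]:
    """Pop repeatedly until popping fails.

    Returns the sequence of measures (stack sizes) observed, so that the
    strict decrease along each loop iteration can be inspected.
    """
    measures = [len(stack)]
    while True:
        try:
            _, stack = pop(stack)
        except EmptyStack:
            return measures
        # Progress along the loop: the measure strictly decreases, so the
        # "same goal for a modified stack" can only recur finitely often.
        assert len(stack) < measures[-1]
        measures.append(len(stack))
```

Because the measures form a strictly decreasing sequence of natural numbers, the loop must end; this mirrors, for finite stacks only, the discharge condition described above.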

2.3 Programming ODSs in Erlang 

Erlang [1] is at its core a functional programming language, extended with a 
notion of processes and primitives for message passing. Erlang has a small set of 
powerful constructs, and is therefore suitable both as a modelling and as an
implementation language for ODSs consisting of a high number of light-weight 
processes. It is especially suitable for telecommunications software. In contrast 
to most other functional programming languages, Erlang has seen heavy use 
in industry. In a recent project at Ericsson where a state-of-the-art high-speed 
ATM switch was developed [4], figures of 480 000 lines of Erlang source code 
have been reported, compared to 330 000 lines of C code (most of it in the form 
of imported protocol libraries), and approximately 5 000 lines of Java code. A 
frequently voiced opinion is that a chief reason for the quick development of this 
product, and with resulting excellent quality, is the fast code-debug-replace cycle
made possible through the introduction of Erlang. Another important reason 
for the success of Erlang in such projects clearly are the accompanying libraries 
which provide support for many aspects of developing and maintaining large 
telecommunications applications. There is for instance support for distributed 
data base access, error recovery, and code replacement during runtime. A brief 
overview of the Erlang fragment used in this paper can be found in Appendix A.1.



Formal Semantics. Our semantics for Erlang is a small-step operational one [6].
The basic message of the previous section with respect to language semantics
was the desirability of mimicking, in the language semantics, the conceptual view
that a programmer has of a system built using Erlang. The semantics developed
here matches closely the hierarchic structure of the Erlang language. First the
Erlang expressions are provided with a semantics that does not require any
notion of processes. The actions here are a computation step $\tau$, an output $pid!v$,
$read(q, v)$ for reading a value $v$ from the queue of the process in whose context
the expression executes, and $f(v_1, \ldots, v_n) \leadsto v$ for calling a built-in function
(like spawn for process spawning) with side-effects on the process-level state. An
example of an expression-level transition rule is:



furii 



/(i^) 



/(^) ^ ^ 



The transitional behaviours of Erlang systems are captured in two
cases: (i) a single process constraining the behaviour of an Erlang expression, as
illustrated in the following rule for process spawning:

\[
\textit{spawning}\quad
\frac{e \xrightarrow{spawn(module, f, v) \leadsto pid'} e' \qquad pid' \neq pid}
     {proc\langle e, pid, q\rangle \xrightarrow{\tau}
      proc\langle e', pid, q\rangle \parallel proc\langle module{:}f(v), pid', eps\rangle}
\]

and (ii) the (parallel) composition of two Erlang systems into a single one,
exemplified in the following rule for interleaving:

\[
\textit{interleave}\quad
\frac{s_1 \xrightarrow{\alpha} s_1' \qquad wellformed(s_1 \parallel s_2)}
     {s_1 \parallel s_2 \xrightarrow{\alpha} s_1' \parallel s_2}
\]

where $wellformed(s)$ requires that process identifiers of processes in $s$ are unique.
The system actions are computation steps $\tau$, input $pid?v$ and output $pid!v$.
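
The two system-level cases can be mimicked with a small executable model. This is a sketch only: expressions are opaque strings, the Proc record and the function names are hypothetical, and the sketch checks uniqueness of process identifiers on the composed result, which is a modelling choice of this example rather than a statement about the exact side condition of the paper's rule.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Proc:
    """A process: an expression, its process identifier, and its mailbox."""
    expr: str
    pid: str
    queue: Tuple[str, ...] = ()


System = Tuple[Proc, ...]


def wellformed(system: System) -> bool:
    """Process identifiers within a system must be unique."""
    pids = [p.pid for p in system]
    return len(pids) == len(set(pids))


def spawn_step(proc: Proc, continuation: str, call: str,
               new_pid: str) -> Optional[System]:
    """Process-level spawning step.

    The spawning process continues with `continuation`; a new process with an
    empty mailbox evaluates `call`. The step is refused (None) when the new
    identifier coincides with that of the spawning process.
    """
    if new_pid == proc.pid:
        return None
    return (Proc(continuation, proc.pid, proc.queue), Proc(call, new_pid))


def interleave(s1_after: System, s2: System) -> Optional[System]:
    """Compose the stepped left system with an idle right system."""
    composed = s1_after + s2
    return composed if wellformed(composed) else None


parent = Proc("spawn(m, f, v)", "p1")
after_spawn = spawn_step(parent, "e_prime", "m:f(v)", "p2")
```

A composition in which the freshly chosen identifier clashes with a process of the other component is rejected, which is what rules out such transitions in the model.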
2.4 Verifying ODSs in EVT

The Erlang Verification Tool (EVT for short) is a proof editing tool implementing 
the above described framework: providing a property specification language and 
an embedding of an operational semantics for Erlang, combined with a general 
proof system based on the classical first-order sequent calculus. 



F ::= tt | ff | T = T | F /\ F | F => F | F \/ F | not F
    | forall Var : Type . F | exists Var : Type . F
    | \Var:Type.F | F T | T : F
    | [Action]F | <Action>F

PredicateDef ::= Name : PropType DefSymbol F
PropType     ::= prop | Type -> PropType
DefSymbol    ::= => | <= | =



Fig. 1. The syntax of logic formulae and definitions 



The syntax of the specification logic of EVT is illustrated in Figure 1. In
addition to the usual connectives of predicate logic the $\langle\alpha\rangle F$ and $[\alpha]F$ modalities
are available with their usual meaning, defined by referring to the transition
relations of the embedded operational semantics. The T : F construct expresses
the proposition “T satisfies F” (or “T has type F”). In the following we will
refer to a number of predefined types such as erlangValue (ground values),
erlangExpression (expressions), erlangSystem (systems), erlangAction (actions
ranging over computation steps tau, output pid!v and input pid?v), etc.
In the definition of a predicate (PredicateDef) the => symbol selects the greatest
fixed point, the <= symbol the least fixed point, while = is for non-recursive
definitions (shorthands). Recursive occurrences of predicates are only permitted
under an even number of negations to ensure monotonicity.
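
The restriction on recursive occurrences is a syntactic check that is easy to sketch. The tuple encoding of formulae below is hypothetical and is not EVT's internal representation; treating the left-hand side of an implication as one negation is the usual reading and is an assumption here, since the text only mentions negations.

```python
from typing import Tuple

# Hypothetical formula encoding:
#   ("tt",) ("ff",) ("var", name) ("not", f) ("and", f, g) ("or", f, g)
#   ("implies", f, g) ("box", action, f) ("dia", action, f)
Formula = Tuple


def monotone_in(formula: Formula, name: str, negations: int = 0) -> bool:
    """True if every occurrence of `name` lies under an even number of negations."""
    tag = formula[0]
    if tag == "var":
        return formula[1] != name or negations % 2 == 0
    if tag in ("tt", "ff"):
        return True
    if tag == "not":
        return monotone_in(formula[1], name, negations + 1)
    if tag in ("and", "or"):
        return all(monotone_in(sub, name, negations) for sub in formula[1:])
    if tag == "implies":
        # f => g behaves like (not f) \/ g, so f counts as negated once.
        return (monotone_in(formula[1], name, negations + 1)
                and monotone_in(formula[2], name, negations))
    if tag in ("box", "dia"):
        return monotone_in(formula[2], name, negations)
    raise ValueError(f"unknown formula tag: {tag!r}")


X = ("var", "X")
invariant_body = ("and", ("var", "P"), ("box", "a", X))   # P /\ [a]X
```

A body such as P /\ [a]X passes the check, whereas not X or X => tt would be rejected as the body of a recursive definition of X.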

3 Verifying a Prototypical Open Distributed System 

Active data structures, i.e., collections of processes that by coordinating their
activities mimic in a concurrent way some data structure, are frequently used 
in telecommunication software. In a previous study [2] a protocol for respond- 
ing to database queries, directed to the distributed database manager Mnesia, 
was verified. Internally the protocol built up a ring-like structure of connected
processes in order to answer queries efficiently. In the current example we ex- 
amine a scheme for a set implementation inspired by a set-as-process example 
of Hoare [7]. Here the active data structure is a linked list, but the similarities 
with the database query example are striking. 
3.1 An Implementation of a Persistent Set

As an abstract mathematical notion, a set is simply a collection of objects (taken
out of a universe of objects), characterized by the membership relation “∈”: if
s is an object and S is a set, then the statement s ∈ S is either true or false.
Using the membership relation, one can define sets as unions, intersections, or 
differences of other sets, or in other ways. 

Computer scientists also have another view of sets, namely as mutable objects:
a set, when manipulated by adding or removing elements, still keeps its
“identity”, e.g. through an identifier. Any data-structure for manipulating col- 
lections of objects, which does not impose an order on its elements (i.e. hides 
this order through its interface), can be understood as implementing a set. 

The objects to be manipulated can be distributed in space, and if the objects 
themselves are large, it is conceivable that we might want each object to be
maintained by a separate process. A further reason for implementing a set as an 
active data structure is to permit concurrent access to multiple elements. 

A complete implementation of a set, without the possibility of removing elements,
by means of a collection of interacting processes is given in Appendix A.2, where
a module persistent_set_adt is defined. Internally the module implements
two functions: one for maintaining single elements, and one for the empty
set. A set is identified by an Erlang process identifier. A newly created set
initially consists of a single process executing the empty_set function; it is the
process identifier of this process by which the set is identified from then
on. When an element is added, a new process is spawned off to store the element
if it is not already present in the set. Internally, when a new element is added
to a set, it is “pushed downwards” through the list of processes representing
set elements, until it reaches the empty_set process, which spawns off another
empty_set process, and becomes itself a process maintaining the new element. So,
as a result, a set is implemented as a unidirectional linked collection of processes
referenced by a process identifier.
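
The “push downwards” scheme can be simulated with a toy scheduler. The following is a Python model written for illustration, not the Erlang module of Appendix A.2: the message shapes, the reply bookkeeping, the deterministic scheduling and all names other than mk_empty, add_element, is_member and is_empty are assumptions.

```python
from collections import deque
from itertools import count


class SetSystem:
    """Toy model of a chain of mailbox-driven processes implementing a set."""

    def __init__(self):
        self._fresh = count(1)
        self.procs = {}      # pid -> ("empty",) or ("set", element, next_pid)
        self.mailboxes = {}  # pid -> deque of pending messages
        self.replies = {}    # client tag -> reply value

    def spawn_empty(self):
        pid = next(self._fresh)
        self.procs[pid] = ("empty",)
        self.mailboxes[pid] = deque()
        return pid

    def send(self, pid, message):
        self.mailboxes[pid].append(message)

    def run(self):
        """Handle messages in FIFO order per mailbox until all are empty."""
        progress = True
        while progress:
            progress = False
            for pid in list(self.procs):
                if self.mailboxes[pid]:
                    self._handle(pid, self.mailboxes[pid].popleft())
                    progress = True

    def _handle(self, pid, message):
        kind, payload, client = message
        state = self.procs[pid]
        if state[0] == "empty":
            if kind == "add_element":
                # Spawn a new empty-set process and become an element holder.
                self.procs[pid] = ("set", payload, self.spawn_empty())
            elif kind == "is_empty":
                self.replies[client] = True
            elif kind == "is_member":
                self.replies[client] = False
            return
        _, element, next_pid = state
        if kind == "is_empty":
            self.replies[client] = False
        elif kind == "is_member" and payload == element:
            self.replies[client] = True
        elif kind == "add_element" and payload == element:
            pass  # element already present, nothing to do
        else:
            self.send(next_pid, message)  # push the request down the chain


def mk_empty(system):
    return system.spawn_empty()


def add_element(system, set_pid, element):
    system.send(set_pid, ("add_element", element, None))
    system.run()


def is_member(system, set_pid, element):
    system.send(set_pid, ("is_member", element, "client"))
    system.run()
    return system.replies.pop("client")


def is_empty(system, set_pid):
    system.send(set_pid, ("is_empty", None, "client"))
    system.run()
    return system.replies.pop("client")
```

After adding two distinct elements the model contains three processes: two element holders followed by one empty-set process at the end of the chain.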

To encapsulate the set against improper use, we provide a controlled interface 
to the set module, consisting of a function for set creation mk_empty, testing for 
membership is_member, addition of elements add_element, etc. The set creation
function, for example, spawns off a process executing the empty_set function, 
and returns the process identifier of the newly spawned process. This process 
identifier has then to be provided as an argument to all the other interface func- 
tions. The implementation of the two set functions and the interface prevents 
the user of the set module from having to notice that sets are internally repre- 
sented by processes, and moreover prevents direct access to any other process 
identifiers created internal to the linked list of processes. 

Note however that any process, given knowledge of the process identifier of 
a persistent set, can choose to circumvent the interface functions and directly 
communicate (through message passing) with the set process. As we shall see in 
the proof such “protocol abuse” can lead to program errors. 
3.2 A Persistent Set Property

To check the correctness of a persistent set implementation, we have to specify 
those properties of sets which we consider paramount for correct behaviour. 
Ideally, one would like such a specification to be complete, i.e. a system should 
satisfy the specification exactly when it implements such a set. Completeness, 
however, is usually difficult to achieve in practice, since such a specification would 
be very detailed and the resulting proofs could easily become unmanageably 
complex. 

One crucial property of persistent sets is naturally that they retain any ele- 
ment added to them. For simplicity, we will here prove a simpler property, that 
once any element has been added to such a set the set will forever be non-empty. 
The main predicates are: 



ag_non_empty : erlangPid -> erlangSystem -> prop =>
    \SetPid : erlangPid . \SetSys : erlangSystem .
    ((SetSys : non_empty SetPid) /\
     (SetSys : forall Alpha:erlangAction. [Alpha] (ag_non_empty SetPid)));

persistently_non_empty : erlangPid -> erlangSystem -> prop =>
    \SetPid : erlangPid . \SetSys : erlangSystem .
    (((SetSys : non_empty SetPid) /\ (SetSys : ag_non_empty SetPid)) \/
     (SetSys : empty SetPid) /\ (SetSys : forall Alpha:erlangAction.
        [Alpha] (persistently_non_empty SetPid)));



Intuitively the persistently_non_empty predicate expresses an automa- 
ton that, when applied to a process identifier SetPid and an Erlang sys- 
tem SetSys representing a set, checks that empty SetPid remains true until 
non_empty SetPid becomes true, after which non_empty SetPid must remain 
continuously true forever (definition ag_non_empty). Note that this is, in some
respect, a challenging property since it contains both a safety part (non-empty
sets never claim to be empty) and a liveness part (all sets eventually answer
queries whether they are empty).
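
The automaton reading of the predicate can be sketched on finite observation sequences. This is a deliberate simplification: the predicate itself quantifies over all transitions of a possibly infinite-state system, while the sketch below only checks one finite, linear sequence of observations, each assumed to be either "empty" or "non_empty". The function name reuses the predicate's name for readability only.

```python
from typing import Iterable


def persistently_non_empty(observations: Iterable[str]) -> bool:
    """Check a finite observation sequence against the two-phase automaton.

    Phase 1: the set is observed as "empty".
    Phase 2: once "non_empty" has been observed, every later observation
             must also be "non_empty".
    """
    seen_non_empty = False
    for observation in observations:
        if observation == "non_empty":
            seen_non_empty = True
        elif observation == "empty":
            if seen_non_empty:
                return False  # safety violation: a non-empty set claims to be empty
        else:
            return False      # neither observation holds in this state
    return True
```

A sequence that stays "empty" throughout is accepted, matching the greatest fixed-point reading in which the set is never required to become non-empty.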

We advocate an observational approach to specification, through invocation 
of the interface functions, as evidenced in the definition of the empty predicate: 



empty: erlangPid -> erlangSystem -> prop =
    \SetPid : erlangPid . \SetSys : erlangSystem .
    (forall Pid:erlangPid.
        ((not (Pid = SetPid)) =>
         (proc<is_empty(SetPid), Pid, eps> || SetSys : (evaluates_to Pid true))));



The empty predicate expresses that proc<is_empty(SetPid), Pid, eps>,
an observer process, will eventually (in a finite number of steps) terminate with 
the value true, if executing concurrently with the observed set SetSys. For lack 
of space the definition of the evaluates_to predicate has been omitted. 



3.3 A Proof Sketch 

Expressed in the syntax of EVT the main proof obligation becomes: 



prove "declare P:erlangPid in |- proc<empty_set () , P, eps> : persistently_non_empty P"; 
That is, the Erlang system proc<empty_set(), P, eps>, an initially empty
set, satisfies the persistently_non_empty P property. In fact we will prove a
slightly stronger property: 

Goal #0: not (add_in_queue Q) |- proc<empty_set () , P, Q> : persistently_non_empty P; 

where the not (add_in_queue Q) assumption expresses that the queue Q does
not contain an add_element message. This proof goal is reduced by unfolding
the definition of the persistently_non_empty predicate, choosing to show that 
the set process will signal that it is empty when queried, and performing a few 
other trivial proof steps. There are two resulting proof goals: 



#1: not (add_in_queue Q) |- proc<empty_set(), P, Q> : empty P
#2: not (add_in_queue Q) |- proc<empty_set(), P, Q> :
        forall Alpha:erlangAction. [Alpha] (persistently_non_empty P)



Goal #1 reduces to (after unfolding empty and rewriting): 

not (add_in_queue Q) , not(P=P’) |- proc<is_empty (P) , P’, eps> || proc<empty_set () , P, Q> : 

evaluates_to P’ true 



That is, an observer process calling the interface routine is_empty with the
set process identifier P as argument will eventually (in a finite number of steps) 
evaluate to the value true (meaning that the set is empty). Here the proof 
strategy is to symbolically “execute” the two processes together with the formula, 
and observe that in all possible future states the observer process terminates with 
true as the result. Note however that the assumption not (add_in_queue Q) is 
crucial due to the Erlang semantics of queue handling. If the queue Q contains an 
add_element message the observer process will instead return false as a result,
since its is_empty message would be stored after the add_element message in
the queue and thus be serviced only after an element is added to the set. 
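
The role of the queue assumption can be reproduced in a few lines. The sketch below models only a single process that starts as the empty set and services its mailbox in FIFO order; the message tuples and the function name are hypothetical, and the chain of element processes is deliberately left out.

```python
from collections import deque
from typing import Iterable, Tuple

Message = Tuple[str, object]
OBSERVER_QUERY: Message = ("is_empty", "observer")


def observe_is_empty(pending: Iterable[Message]) -> bool:
    """Reply received by an observer whose is_empty query is queued after
    the messages already pending in the mailbox of an empty-set process."""
    mailbox = deque(pending)
    mailbox.append(OBSERVER_QUERY)
    empty = True  # the process starts out as the empty set
    while mailbox:
        message = mailbox.popleft()
        if message[0] == "add_element":
            empty = False  # serviced before the observer's query
        elif message == OBSERVER_QUERY:
            return empty
    raise AssertionError("unreachable: the observer's query is always queued")
```

With no add_element message pending the observer is told the set is empty; a single pending add_element message is enough to flip the answer, even though the process was in its empty-set state when the query was sent.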

The second proof goal #2 is reduced by eliminating the universal quantifier, 
and computing the next state under all possible types of actions. Since the 
process is unable to perform an output action there are two resulting goals, one 
which corresponds to the input of a message V (note the resulting queue Q@V) 
and the second a computation step (applying the empty_set function).

#3: not (add_in_queue Q) |- proc<empty_set () , P, Q@V> : persistently_non_empty P 
#4: not (add_in_queue Q) |- proc<receive ... end, P, Q> : persistently_non_empty P 

Proceeding with goal #4 either the first message to be read from the queue 
is is_empty or is_member (the possibility of an add_element message can be
discarded due to the queue assumption). Handling these two new goals presents 
no major difficulties. 

Goal #3 is reduced by analysing the value of V. If it is not an add_element
message then we can easily extend the queue assumption from Q to Q@V:



#5: not (add_in_queue Q@V) |- proc<empty_set () , P, Q@V> : persistently_non_empty P 
Goal #5 is clearly an instance of goal #0, i.e., we can find a substitution of
variables that when applied to the original goal will result in the current proof 
goal (the identity substitution except that it maps the queue Q to the queue 
Q@V). Since we have at the same time unfolded a greatest fixed point on the 
right hand side of the turnstile (the definition of persistently_non_empty) we 
are allowed to discharge the current proof goal at this point. If, on the other 
hand, V is an add_element message the next goal becomes: 

#6: add_in_queue Q@V |- proc<empty_set () , P, Q@V> : persistently_non_empty P 

At this point we cannot discharge the proof goal, since there is no substitution 
from the original proof goal to the current one. Instead we repeat the steps of 
the proof of goal #0 but taking care to show non_empty P instead of empty P.
Also, we cannot discard the possibility of receiving an add_element message and
the resulting goal is (after weakening out the queue assumption): 

#7: |- proc<set(Element, mk_empty(...)), P, Q’> : ag_non_empty P

By repeating the above pattern of reasoning with regards to goal #7 we 
eventually reach the proof state: 

#8: not(P=P’) |- proc<set(Element, P’), P, Q’’> || proc<empty_set(), P’, eps> : ag_non_empty P

The Erlang components of the proof states of the proof, up to the point of 
the spawning off of the new process, are illustrated in Figure 2. 



proc<empty_set(), P, Q1>

P’’ ! {is_empty, false}

proc<receive ... end, P, Q2>

P’’ ! {is_member, ..., false}

V = {is_empty, P}
proc<Client ! {is_empty, true}, ..., P, Q3>

V = {is_member, P’}
proc<Client ! {is_member, Element, false}, ..., P, Q3>

V = {add_element, ...}
proc<set(Element, mk_empty()), P, Q3>

proc<set(Element, spawn(...)), P, Q4>

proc<set(Element, P’), P, Q5> || proc<empty_set(), P’, eps>



Fig. 2. Erlang components of initial proof states 



At this point we have reached a critical point in the proof where some manual
decision is required. Clearly we can repeat the above proof steps forever,
never being able to discharge all proof goals, due to the possibility of spawning
new processes. Instead we apply the term-cut proof rule, to abstract the freshly 
spawned processes with a formula psi ending up with two new proof goals: 

#9: not(P=P’) |- proc<empty_set(), P’, eps> : psi P P’

#10: not(P=P’), X:psi P P’ |- proc<set(Element, P’), P, Q’’> || X : ag_non_empty P

How should we choose psi? The cut formula must be expressive enough to 
characterise the proc<empty_set(), P’, eps> process, in the context of the
second process and for the purpose of proving the formula ag_non_empty P.
Here it turns out that the following formula is sufficient: 

psi: erlangPid -> erlangPid -> erlangSystem -> prop =>
    \P:erlangPid . \P’:erlangPid .
    ((forall P:erlangPid . forall V:erlangValue . [P?V] ((not (is_empty V)) => psi P P’))
     /\ (forall P:erlangPid . forall V:erlangValue . [P!V] (not (is_empty V)))
     /\ (converges P P’) /\ (foreign P) /\ (local P’));



Intuitively psi expresses: 

— Whenever a new message is received, and it is not an is_empty message,
then psi continues to hold.

— An is_empty reply is never issued.

— The predicated system can only perform a finite number of internal
and output steps (definition of converges omitted).

— Process identifier P is foreign (does not belong to any process in the predicated
system) and process identifier P’ is local (definitions omitted).

The proof of goal #9 is straightforward up to reaching the goal: 

#11: not(P’=P’’) |- proc<set(Element, P’’), P’, Q’’’> || proc<empty_set(), P’’, eps> : psi P P’

Here we once again apply the term-cut rule to obtain the following goals: 

#12: not(P’=P’’) |- proc<empty_set(), P’’, eps> : psi P’ P’’

#13: Y:psi P’ P’’ |- proc<set(Element, P’’), P’, Q’’’> || Y : psi P P’

Goal #12 can be discharged immediately due to the fact that it is
an instance of goal #9. Goal #13 involves symbolically executing the
proc<set(Element, P’’), P’, Q’’’> process together with the (abstracted) process
variable Y, thus generating their combined proof state space. Since both
these systems generate finite proof state spaces this construction will eventually 
terminate. The proof of goal #10 is highly similar to the proof of goal #13 above, 
and is omitted for lack of space. 

3.4 A Discussion of the Proof 

The proof itself represented a serious challenge in several respects: 

— The modelled system is an open one in which at any time additional set 
elements can be added outside of the control of the set implementation itself. 
The state space of the set implementation is clearly not finite: both
the number of processes and the size of message queues can potentially grow 
without bound. 
— The queue semantics of Erlang has some curious effects with regard to
an observer interacting with the set implementation. It is for instance not
sufficient to consider only the program states of the set process to determine
whether an observer will recognise a set to be empty or not; the contents
of the input message queue of the set process also have to be taken into account.

Although the correctness of the program may at first glance appear obvi- 
ous, a closer inspection of the source code through the process of proving the 
implementation correct revealed a number of problems. 

For instance, in an earlier version of the set module the guards pid(Client)
in the empty_set and set functions were missing. These guards serve to ensure
that any received is_empty or is_member message must contain a valid process
identifier. Should these guards be removed, a set process will terminate due to a
runtime (typing) error if, say, a message {is_empty, 21} is sent to it.
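
The effect of the guard can be imitated in a small model. This is an analogy in Python, not the Erlang code of the set module: the Pid class, the send function and the "ignored" outcome (standing in for a message that simply fails to match a guarded receive clause) are assumptions made for the example.

```python
class Pid:
    """Hypothetical stand-in for a process identifier with an inbox."""

    def __init__(self):
        self.inbox = []


def send(target, message):
    """Deliver a message; sending to something that is not a Pid is an error."""
    if not isinstance(target, Pid):
        raise TypeError(f"cannot send to non-pid value {target!r}")
    target.inbox.append(message)


def handle_is_empty(message, guarded: bool) -> str:
    """Handle an is_empty request in an empty-set process model.

    With guarded=True the clause only applies when the client is a Pid,
    mirroring the role of the pid(Client) guard described in the text.
    """
    kind, client = message
    if kind != "is_empty":
        return "ignored"
    if guarded and not isinstance(client, Pid):
        return "ignored"  # the guarded clause does not match this message
    send(client, ("is_empty", True))
    return "replied"
```

Without the guard, a message such as ("is_empty", 21) reaches the send and raises an error, which corresponds to the runtime failure described above.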

In most languages adding such guards would not be needed since usage of 
the interface functions should ensure that these kinds of “typing errors” can 
never take place. In Erlang, in contrast, it is perfectly possible to circumvent the 
interface functions and communicate directly with the set implementation. 

4 Conclusion 

We have introduced an ambitious proof-system-based verification framework
that enables formal reasoning about complex open distributed systems (ODSs) 
as programmed in real programming languages like Erlang. The proof method 
was illustrated on a prototypical example where a set library was implemented as 
a process structure, and verified with respect to a particular correctness property 
formulated in an expressive specification logic. Parts of the proof were checked 
using the Erlang Verification Tool, a proof editing tool with specific knowledge 
about Erlang syntax and operational semantics. 

In conclusion, we have illustrated the potential of the approach:
we were able to verify non-trivial properties of real code in spite of difficulties
such as the essentially non-finite-state nature of the class of open distributed
systems studied in the example. In addition, the approach is promising because
of its generality: we are not tied to the currently studied programming language
(Erlang) but can, by providing alternative operational semantics, target other
programming languages. Still, numerous improvements of the framework are
necessary, perhaps at the moment most importantly with respect to the
interaction with the proof editing tool. We are currently forced to reason at a
level of detail where too many manual proof steps are required to complete
proofs. How to rectify this situation by providing high-level automated proof
tactics remains an area of active research.

A Appendix

A.1 A Short Introduction to Erlang

An Erlang process, here written proc⟨e, pid, q⟩, is a container for evaluation of
functional expressions e that can potentially have side effects (e.g.,
communicate). A process has a unique process identifier (pid) which is used to
identify the recipient process in communications. Communication is always
binary, with one (anonymous) party sending a message (a value) to a second party
identified by its process identifier. Messages sent to a process are put in its
mailbox q, queued in arriving order. The empty queue will be denoted with eps,
and q · v is a queue composed of the subqueue q and the value v. To express the
concurrent execution of two sets of processes s1 and s2, the syntax s1 ∥ s2 is
used.

The functional sublanguage of Erlang is rather standard: atoms, integers, lists
and tuples are value constructors; e1(e2) is a function call; e1, e2 is
sequential composition; case e of p1 [when e1g] -> e1; ...; pn [when eng] -> en
end is matching: the value that e evaluates to is matched sequentially against
patterns (values that may contain unbound variables) pi, respecting the optional
guard expressions eig. e1 ! e2 is sending, whereas receive m end inspects the
process mailbox q and retrieves (and removes) the first element in q that
matches any pattern in m. Once such an element v has been found, evaluation
proceeds analogously to case v of m. Finally, if e1 -> e1'; ...; en -> en' end
is sequential choice.
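The mailbox discipline described above (scan the queue in arrival order, take the first message that matches any clause, leave the others in place) can be sketched in Python. This is only an illustration under stated assumptions: `receive`, the `(pattern, handler)` clause encoding and the sample messages are hypothetical names, and a real Erlang process would block rather than return `None` when nothing matches.

```python
from collections import deque


def receive(mailbox, clauses):
    """Remove and return the result for the first message matched by any clause.

    `mailbox` is a deque ordered by arrival; `clauses` is a list of
    (pattern, handler) pairs where `pattern(msg)` returns True on a match.
    Returns None when no queued message matches (Erlang would block instead).
    """
    for index, message in enumerate(mailbox):
        for pattern, handler in clauses:
            if pattern(message):
                del mailbox[index]      # the matched message leaves the queue
                return handler(message)
    return None


# Hypothetical mailbox: the first message matches no clause and stays queued.
mailbox = deque([
    ("add_element", 3),
    ("is_empty", "client"),
    ("is_member", 3, "client"),
])
clauses = [
    (lambda m: m[0] == "is_member" and len(m) == 3, lambda m: ("member query", m[1])),
    (lambda m: m[0] == "is_empty" and len(m) == 2, lambda m: ("empty query",)),
]
result = receive(mailbox, clauses)
```

Note that the order of messages takes priority over the order of clauses: the `is_empty` message is selected because it is the earliest message matched by some clause, even though the `is_member` clause is listed first.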
Expressions are interpreted relative to an environment of “user defined”
function definitions f(p1) -> e1; ...; f(pk) -> ek. The Erlang functions are
placed in modules using the -module construct. Functions not explicitly exported
using the -export construct are unavailable outside the body of the module.

A number of builtin functions are used in the paper. self returns the pid of
the current process. spawn(e1, e2, e3) creates a new process, with empty
mailbox, executing the expression e2(e3) in the context of module e1. The pid of
the new process is returned. pid(e) evaluates to true if e is a pid and false
otherwise. Syntactical equality (inequality) is checked by == (/=).

A.2 The Set Erlang Module

-module(persistent_set_adt).
-export([mk_empty/0, is_empty/1, is_member/2, add_element/2, empty_set/0]).

%% Process representing the empty set (the end of the process chain).
empty_set() ->
    receive
        {is_empty, Client} when pid(Client) ->
            Client ! {is_empty, true}, empty_set();
        {is_member, Element, Client} when pid(Client) ->
            Client ! {is_member, Element, false}, empty_set();
        {add_element, Element} ->
            set(Element, mk_empty())
    end.

%% Process holding Element; Set is the pid of the process holding the rest.
set(Element, Set) ->
    receive
        {is_empty, Client} when pid(Client) ->
            Client ! {is_empty, false}, set(Element, Set);
        {is_member, SomeElement, Client} when pid(Client) ->
            if
                SomeElement == Element ->
                    Client ! {is_member, SomeElement, true}, set(Element, Set);
                SomeElement /= Element ->
                    %% Forward the query to the rest of the chain.
                    Set ! {is_member, SomeElement, Client}, set(Element, Set)
            end;
        {add_element, SomeElement} ->
            if
                SomeElement == Element ->
                    set(Element, Set);
                SomeElement /= Element ->
                    Set ! {add_element, SomeElement}, set(Element, Set)
            end
    end.

%% MODULE INTERFACE FUNCTIONS

mk_empty() -> spawn(persistent_set_adt, empty_set, []).

is_empty(Set) ->
    Set ! {is_empty, self()},
    receive
        {is_empty, Value} -> Value
    end.

is_member(Element, Set) ->
    Set ! {is_member, Element, self()},
    receive
        {is_member, Element, Value} -> Value
    end.

add_element(Element, Set) -> Set ! {add_element, Element}.
Abstract. In this paper, we present a new system for proof-search in
propositional intuitionistic logic from which an efficient implementation
based on structural sharing is naturally derived. The way to solve the
problem of formula duplication is not based on logical solutions but on an
appropriate representation of sequents with a direct impact on sharing
and therefore on the implementation. Then, the proof-search is based on
a finer control of the resources and has an $O(n \log n)$-space complexity.
The system has the subformula property and leads to an algorithm that,
for a given sequent, constructs a proof or generates a counter-model.



1 Introduction 

In recent years there has been a renewed interest in proof-search for
constructive logics like intuitionistic logic (IL), mainly because of research
in intuitionistic type theories and their relationships with programming through
proof-search [7]. Nowadays, theorem provers are more often used in the formal
development of software and systems and have to be integrated into various
environments and large applications. Then, to provide a simple implementation of
a logical calculus (and connected proof-search methods) in imperative
programming languages like C or Java, we have to consider new techniques leading
to the clarity of the implementation [8] but also to a good and efficient
explicit management of formulae, considered here as resources. Potential support
for user interfaces or for a natural parallelisation of the proof-search process
could also be expected.

A recent work on efficient implementation of a theorem prover for Intuitionistic 
Propositional Logic (IPL) has emphasised the interest of finer control and man- 
agement of the formulae involved in the proof-search, with an encoding of the 
sequents at the implementation step [2]. But to solve the problem of duplication 
of some formulae we need to reconsider the global problem of proof-search in this 
setting, knowing that a usable proposal is necessary for a good space complexity 
and an efficient implementation based on resources sharing. A logical solution 
based on the introduction of new variables and of some adequate strategies is 
given in [3]. Our goal in this paper is to propose a solution at a structural level 
(no new formulae and only modifications of the sequents structure) including 
a finer management of formulae (as resources). Therefore, we define, from an
analysis of previous works, a new logical system, that has some similarities with
the one of [3]. With such a system we can naturally derive, from the logical rules 
and a particular representation of sequents (with specific tree structures), some 
new rules that act on trees to perform proof-search. Moreover, these ideas can 
also be applied to a refutability system for IPL to generate counter-models in 
case of unprovability [5]. With such an approach the depth of the search tree 
is in both systems linearly bounded and the control on resources leads to an 
efficient implementation in imperative languages. 

2 Proof-Search in Intuitionistic Logic 

We focus on IPL for which the LJ sequent calculus is a simple formulation 
having the subformula property. On the other hand, it lacks termination and thus 
implementing this calculus involves some kind of loop-checking mechanism to 
ensure the termination of computation. A solution is the use of the contraction- 
free sequent calculus LJT [1] that has the nice property that backward search 
does not loop. There exist other proposals for proof-search in IL based, for
instance, on resolution [7], constraints [9] or skolemization [6].

Let us start by recalling the main characteristics of the LJT system. It is sound
and complete w.r.t. provability in LJ, and a backward search does not loop and
always terminates [1]. For that, the left-implication rule of LJ is replaced by
four rules depending on the form of the premiss of the principal formula. The
treatment of implication thus becomes finer and termination is obtained
from a well-founded ordering on sequents. Moreover the contraction rule is no
longer a primitive rule but is nevertheless admissible. The corresponding
multi-conclusion calculus is also a good basis for a non-looping method to
construct Kripke trees as counter-models for non-provable formulae of IPL [5].
It provides a calculus, called CRIP, in which refutations are inductively
defined and Kripke counter-models can be extracted from these refutations. The
rules of LJT are given in figure 1, $X$ representing atomic formulae, $A$, $B$,
$C$ representing formulae and $\Gamma$ being a multiset of formulae. This system
provides a decision procedure in which the deductions (proofs) may be of
exponential depth in the size of their endsequent. Moreover in rules like
$\to$-$L_3$, some formulae of the conclusion appear twice (duplication problem).
We also observe that, for instance in the $\to$-$L_3$ rule, the premises are not
smaller (in the usual meaning) than the conclusion, even if they are with a
multiset ordering (see [1]). The LJT prover, implemented in Prolog, is based on
an analytic approach where the size of the formulae decreases but a potentially
large number of subformulae might appear. One has no explicit knowledge of the
formulae used during the proof-search or of the depth of proofs. Moreover, LJT
does not satisfy the subformula property: for instance, in rule $\to$-$L_4$, the
formula $B \to C$ is introduced in the left premise even if it is not a
subformula of $(A \to B) \to C$. Some rules duplicate formulae, as in rule
$\to$-$L_3$ where $C$ is duplicated and in rule $\to$-$L_4$ where $B$ is
duplicated. Therefore, the idea to share formulae naturally arises so that the
duplication would be harmless in terms of space. But we have to know which
formulae might appear during such backward proof-search. We have recently
proposed an explicit management of the formulae (considered as resources) and a
new proof-search method for LJT based on new concepts and results such as a new
notion of subformula [2].

Fig. 1. The LJT calculus

In the LG system, independently proposed in [3], the length of the deductions is
linearly bounded in terms of the size of the endsequent. A backward application
of the rules gives rise to an $O(n \log n)$-space decision algorithm for IPL. In
this work, the strategy used to avoid the duplication of subformulae is
different for the $\to$-$L_3$ and $\to$-$L_4$ rules: for the first one, it
amounts to adding new propositional variables and replacing complex formulae by
such new variables. For the second one, it forces the sequents to keep a
convenient form in order to apply some transformations or reductions. In fact,
in this system, the duplication problem is solved at a logical level by the
introduction of new variables and the use of particular sequents [3]. But it is
complicated and several implementation problems arise. Our aim, in this paper,
is to propose an alternative (not logical but structural) solution based on a
fine management of new structures appropriate for an efficient implementation.

Let us mention the work of Weich [10] that is based on constructive proofs of the
decidability of IPL by simultaneously constructing either a proof, taking up the
approach of [3], or a counter-model, refining the idea of [4]. His algorithm is
faster than the refutability algorithm of [5]. When restricted to provability it
coincides with the one of [3], including the same logical solution for the
duplication problem. Our proposal is based on a new logical system that has some
similarities with the formulations of LG dedicated to the implementation [3].
The duplication problem is in our case treated by a structural approach in which
no new variable or formula is introduced and such that we only modify the
structure of the sequents. It will lead to a more direct and clear
implementation in imperative languages. The main characteristics coming from the
use of appropriate sharing techniques are: no dynamic memory allocation, a finer
control on the resources and an $O(n \log n)$-space algorithm for provability. A
similar approach can be applied to define and then implement an algorithm for
refutability with the generation of counter-models within the same space
complexity, i.e., with the depth of the search tree being linearly bounded.




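Didier Galmiche and Dominique Larchey-Wendling

Boxed sequents, marked formula not active:

\[
\dfrac{\Gamma^{*}, A, B \vdash \blacksquare}{\Gamma^{*}, A \wedge B \vdash \blacksquare}
\qquad
\dfrac{\Gamma^{*}, A \vdash \blacksquare \quad \Gamma^{*}, B \vdash \blacksquare}{\Gamma^{*}, A \vee B \vdash \blacksquare}
\qquad
\dfrac{\Gamma^{*}, X, C \vdash \blacksquare}{\Gamma^{*}, X, X \to C \vdash \blacksquare}
\]

\[
\dfrac{\Gamma, A, B^{*} \to C \vdash \blacksquare \quad \Gamma^{*}, C \vdash \blacksquare}{\Gamma^{*}, (A \to B) \to C \vdash \blacksquare}
\qquad
\dfrac{\Gamma^{*}, A \to (B \to C) \vdash \blacksquare}{\Gamma^{*}, (A \wedge B) \to C \vdash \blacksquare}
\qquad
\dfrac{\Gamma^{*}, A \to C, B \to C \vdash \blacksquare}{\Gamma^{*}, (A \vee B) \to C \vdash \blacksquare}
\]

Boxed sequents, marked formula active:

\[
\dfrac{}{\Gamma, X, X^{*} \to C \vdash \blacksquare}
\qquad
\dfrac{\Gamma, A, B^{*} \to C \vdash \blacksquare}{\Gamma, (A \to B)^{*} \to C \vdash \blacksquare}
\qquad
\dfrac{\Gamma, A^{*} \to C \vdash \blacksquare \quad \Gamma, B^{*} \to C \vdash \blacksquare}{\Gamma, (A \wedge B)^{*} \to C \vdash \blacksquare}
\]

\[
\dfrac{\Gamma, A^{*} \to C, B \to C \vdash \blacksquare}{\Gamma, (A \vee B)^{*} \to C \vdash \blacksquare}
\qquad
\dfrac{\Gamma, A \to C, B^{*} \to C \vdash \blacksquare}{\Gamma, (A \vee B)^{*} \to C \vdash \blacksquare}
\]

Standard sequents:

\[
\dfrac{}{\Gamma, X \vdash X}
\qquad
\dfrac{\Gamma, A, B \vdash G}{\Gamma, A \wedge B \vdash G}
\qquad
\dfrac{\Gamma, A \vdash G \quad \Gamma, B \vdash G}{\Gamma, A \vee B \vdash G}
\qquad
\dfrac{\Gamma, X, C \vdash G}{\Gamma, X, X \to C \vdash G}
\]

\[
\dfrac{\Gamma, A, B^{*} \to C \vdash \blacksquare \quad \Gamma, C \vdash G}{\Gamma, (A \to B) \to C \vdash G}
\qquad
\dfrac{\Gamma, A \to (B \to C) \vdash G}{\Gamma, (A \wedge B) \to C \vdash G}
\qquad
\dfrac{\Gamma, A \to C, B \to C \vdash G}{\Gamma, (A \vee B) \to C \vdash G}
\]

\[
\dfrac{\Gamma \vdash A \quad \Gamma \vdash B}{\Gamma \vdash A \wedge B}
\qquad
\dfrac{\Gamma, A \vdash B}{\Gamma \vdash A \to B}
\qquad
\dfrac{\Gamma \vdash A}{\Gamma \vdash A \vee B}
\qquad
\dfrac{\Gamma \vdash B}{\Gamma \vdash A \vee B}
\]

Fig. 2. The SLJ-system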



3 A New System for Intuitionistic Proof-Search 

We present in this section a new logical system, called SLJ, that leads to a natural 
use of sharing techniques on structures adapted to an efficient implementation 
in imperative languages. It is derived from the ideas of LJT [1] and of LG [3]. 

3.1 The SLJ System 

The SLJ-system, given in figure 2, includes two kinds of sequents, like
in [3]. We have the usual intuitionistic sequent $\Gamma \vdash A$, which is a
multiset $\Gamma$ of formulae forming the hypothesis together with a unique
formula $A$ called the conclusion. The intuitive meaning of a sequent is that
the conclusion $A$ logically follows from the hypothesis in $\Gamma$. Moreover,
we have a second kind of sequent which we will call boxed sequent:
$\Gamma, \alpha^{*} \to C \vdash \blacksquare$, which intuitively means
$\Gamma, \alpha \to C \vdash \alpha$. The special $\blacksquare$-notation
indicates that the conclusion $\alpha$ is shared with the left side of an
implication in the hypothesis: $\alpha \to C$. For example, in the rule for
$(A \to B) \to C$, $B$ is shared and not duplicated. There is a mark $*$ to
determine which formula is concerned. In this second kind of sequents, there is
exactly one marked formula on the left of an implication of the hypothesis. We
will write $\Gamma^{*}$ to point out the fact that $\Gamma$ contains exactly one
formula of type $\alpha^{*} \to C$. This will happen when this last formula is
not the active one, like in rule $[X\to]$ for example. This idea of marking
$\alpha$ was introduced in [3] to avoid one particular kind of duplication of
formula in LJT and is also useful for our purpose. It is important to notice
that when we try to prove a boxed sequent
$\Gamma, \alpha^{*} \to C \vdash \blacksquare$, the sequents encountered higher
in the proof are always boxed: only boxed sequents appear in the proof of an
initial boxed sequent.

The reader may also notice that the rules $[\wedge_L]$, $[\vee_L]$ and the
left-implication rules such as $[X\to]$ and $[\vee\to]$ are duplicated. There is
one version for boxed sequents and another version for standard sequents. In the
case of boxed sequents, the marked (or starred) formula is not active. We have
chosen not to distinguish the names because only the nature of the sequent
(boxed or not) differs for each pair of rules. On the contrary, if we compare a
left-implication rule with its counterpart for a marked formula, it appears that
they are very different: when the marked formula is active, the rules do not
have the same meaning at all.

This system is very interesting because all the duplications appearing in LJT
have been removed except in the rules whose active formula has the form
$(A \vee B) \to C$ or $(A \vee B)^{*} \to C$. The problem of duplication in the
rule ($\to$-$L_4$) of LJT has then disappeared. We will see in section 4 that
the last duplication cases can be addressed by changing the structure of
sequents to represent sharing.

3.2 Kripke Models 

Kripke models are a very general model representation for logics like
intuitionistic or modal logics. Here, we will adopt the same notation as in [10]
for Kripke trees. A Kripke tree is a pair
$\mathcal{K} = (S_{\mathcal{K}}, [\mathcal{K}_1, \ldots, \mathcal{K}_p])$ where
$S_{\mathcal{K}}$ is a finite set of logical variables and
$[\mathcal{K}_1, \ldots, \mathcal{K}_p]$ is a finite list of Kripke trees.
Moreover, we suppose that for each $i$,
$S_{\mathcal{K}} \subseteq S_{\mathcal{K}_i}$. This monotony condition is
typical of intuitionistic logic as opposed to modal logics for example. This is
an inductive definition of Kripke trees and the base case is when the list
$[\mathcal{K}_1, \ldots, \mathcal{K}_p]$ is empty, i.e. $p = 0$. In fact, Kripke
trees are regular (oriented) trees with each node tagged with a set of variables
that grows along the paths from the root to the leaves.

We then introduce the notion of forcing. Let
$\mathcal{K} = (S_{\mathcal{K}}, [\mathcal{K}_1, \ldots, \mathcal{K}_p])$ be a
Kripke tree and $F$ be a logical formula. We say that $\mathcal{K}$ forces $F$
and write $\mathcal{K} \Vdash F$. The inductive definition is the following:

\[
\begin{array}{ll}
\mathcal{K} \Vdash X \text{ iff } X \in S_{\mathcal{K}} &
\mathcal{K} \Vdash A \to B \text{ iff } (\mathcal{K} \Vdash A \text{ implies } \mathcal{K} \Vdash B) \text{ and } \forall i,\ \mathcal{K}_i \Vdash A \to B \\
\mathcal{K} \Vdash A \wedge B \text{ iff } \mathcal{K} \Vdash A \text{ and } \mathcal{K} \Vdash B &
\mathcal{K} \Vdash A \vee B \text{ iff } \mathcal{K} \Vdash A \text{ or } \mathcal{K} \Vdash B
\end{array}
\]

The only difference with classical logic here is that the forcing of logical
implication $A \to B$ is inherited in the sons $\mathcal{K}_i$ of $\mathcal{K}$
and so in all its subtrees. In fact, this inheritance is extended to all
formulae with the following result, a proof of which can be found in [10].

Lemma 1. If $F$ is a logical formula and
$\mathcal{K} = (S_{\mathcal{K}}, [\mathcal{K}_1, \ldots, \mathcal{K}_p])$ is a
Kripke tree forcing $F$, i.e. $\mathcal{K} \Vdash F$, then for all $i$,
$\mathcal{K}_i \Vdash F$.
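The inductive definition of forcing translates almost line by line into a recursive function. The following Python sketch is only an illustration: the `KripkeTree` class, the tuple encoding of formulae and the function names are assumptions made here, not notation from the paper. It evaluates Peirce's law, a classical tautology that is not intuitionistically valid, on a two-node Kripke tree.

```python
from dataclasses import dataclass, field

# Assumed encoding: a variable is a str; compound formulae are tuples
# ("and", A, B), ("or", A, B) or ("imp", A, B).


@dataclass
class KripkeTree:
    variables: frozenset                       # S_K, variables true at this node
    sons: list = field(default_factory=list)   # [K_1, ..., K_p]


def is_monotone(tree):
    """Check that S_K is included in S_{K_i} for every son, recursively."""
    return all(tree.variables <= son.variables and is_monotone(son)
               for son in tree.sons)


def forces(tree, formula):
    """Forcing relation, following the four clauses of the definition."""
    if isinstance(formula, str):
        return formula in tree.variables
    op, left, right = formula
    if op == "and":
        return forces(tree, left) and forces(tree, right)
    if op == "or":
        return forces(tree, left) or forces(tree, right)
    if op == "imp":
        here = (not forces(tree, left)) or forces(tree, right)
        # An implication must also be forced in every son.
        return here and all(forces(son, formula) for son in tree.sons)
    raise ValueError(f"unknown connective: {op!r}")


# Peirce's law ((X -> Y) -> X) -> X on the tree ({}, [({X}, [])]).
antecedent = ("imp", ("imp", "X", "Y"), "X")
peirce = ("imp", antecedent, "X")
leaf = KripkeTree(frozenset({"X"}))
root = KripkeTree(frozenset(), [leaf])
```

Here `root` forces the antecedent but not `X`, so it does not force Peirce's law, while `leaf` does. The antecedent being forced at both `root` and `leaf` is an instance of Lemma 1.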

Kripke trees are models of intuitionistic logic. Let
$A_1, \ldots, A_n \vdash B$ be a sequent; we say that a Kripke tree
$\mathcal{K}$ is a model of $A_1, \ldots, A_n \vdash B$ if
$\mathcal{K} \Vdash A_1$ and ... and $\mathcal{K} \Vdash A_n$ implies
$\mathcal{K} \Vdash B$. We will often write
$\mathcal{K} \Vdash A_1, \ldots, A_n \vdash B$.

Theorem 2 (Soundness). If $\Gamma \vdash A$ is provable in intuitionistic logic
and $\mathcal{K}$ is a Kripke tree then $\mathcal{K} \Vdash \Gamma \vdash A$.

Models are very often used in a negative way to provide a synthetic argument
for the unprovability of some formula or sequent. Therefore, we say that
$\mathcal{K}$ is a counter-model of the sequent $A_1, \ldots, A_n \vdash B$ if
$\mathcal{K} \Vdash A_1$ and ... and $\mathcal{K} \Vdash A_n$ and
$\mathcal{K} \nVdash B$. We write
$\mathcal{K} \nVdash A_1, \ldots, A_n \vdash B$. There is a constructive version
of the completeness theorem, see [10] for example.

Theorem 3 (Completeness). Let $\Gamma \vdash A$ be a sequent, then either it is
intuitionistically provable, or there exists a Kripke tree $\mathcal{K}$ such
that $\mathcal{K} \nVdash \Gamma \vdash A$.



3.3 Completeness of SLJ 

Before proving the theorem we have to make precise how Kripke trees can model
boxed sequents. $\mathcal{K}$ is a counter-model to
$\Gamma, \alpha^{*} \to C \vdash \blacksquare$ if $\mathcal{K}$ is a
counter-model to $\Gamma, \alpha \to C \vdash \alpha$, so
$\mathcal{K} \Vdash \Gamma, \alpha \to C$ and $\mathcal{K} \nVdash \alpha$. But
we do not know whether $\mathcal{K}$ forces $C$ or not at the level of
$\mathcal{K}$. However, if one of its sons $\mathcal{K}_i$ forces $\alpha$ then
$\mathcal{K}_i$ also forces $C$. The completeness result for SLJ can be
decomposed into two parts: one for boxed sequents and one for standard sequents.
Here we only prove the part for boxed sequents. The other part is very similar.
The reader is reminded that the following proofs are inspired by [5,10].

Theorem 4 (Soundness). If $\Gamma, \alpha^{*} \to C \vdash \blacksquare$ is a
provable sequent in SLJ, then $\Gamma, \alpha \to C \vdash \alpha$ is
intuitionistically provable (or provable in LJT).

To obtain this result, it is sufficient to show that all the boxed rules¹ of SLJ
are admissible. For example, let us explore the case of the rule for a marked
conjunction $(A \wedge B)^{*} \to C$. We have to prove that the following rule
is admissible in IPL:

\[
\dfrac{\Gamma, A \to C \vdash A \quad \Gamma, B \to C \vdash B}{\Gamma, (A \wedge B) \to C \vdash A \wedge B}
\]

From the premisses, we obtain $\Gamma \vdash (A \to C) \to A$ and
$\Gamma \vdash (B \to C) \to B$. The sequent
$(A \to C) \to A, (B \to C) \to B, (A \wedge B) \to C \vdash A \wedge B$ is also
provable. By two applications of the (Cut) rule, we obtain
$\Gamma, \Gamma, (A \wedge B) \to C \vdash A \wedge B$. We finally get the
conclusion by application of the contraction rule. More details can be found
in [3].
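The middle step of this argument, the provability of the sequent with hypotheses $(A \to C) \to A$, $(B \to C) \to B$ and $(A \wedge B) \to C$ and conclusion $A \wedge B$, can be checked mechanically. The Lean 4 term below is a sketch written for this purpose; the theorem and hypothesis names are hypothetical, and no classical axiom is used, so the derivation is intuitionistic.

```lean
-- From (A → C) → A, (B → C) → B and A ∧ B → C, derive A ∧ B constructively.
theorem marked_conj_core (A B C : Prop)
    (hA : (A → C) → A) (hB : (B → C) → B) (hC : A ∧ B → C) : A ∧ B :=
  -- Given a proof of A, the second hypothesis yields B.
  have bOfA : A → B := fun a => hB (fun b => hC ⟨a, b⟩)
  -- The first hypothesis then yields A itself.
  have a : A := hA (fun a' => hC ⟨a', bOfA a'⟩)
  ⟨a, bOfA a⟩
```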

¹ Boxed rules are the rules of figure 2 for which the conclusion is a boxed
sequent.

Theorem 5 (Completeness). Let $\Gamma, \alpha^{*} \to C \vdash \blacksquare$ be
a boxed sequent; either it admits a proof in SLJ, or it admits a Kripke tree
counter-model.

The proof is done by induction on $\Gamma, \alpha \to C$, but it is not a simple
induction on the size of $\Gamma, \alpha \to C$, see [1]. First, we point out
the fact that some rules have the important invertibility property: any
counter-model to one of the premisses is a counter-model to the conclusion. This
is the case for the $[\wedge_L]$, $[\vee_L]$, $[X\to]$, $[\wedge\to]$, and
$[\vee\to]$ rules. Then if $\Gamma$ has one of the following forms,

\[
\Gamma = \Delta, A \wedge B \qquad
\Gamma = \Delta, A \vee B \qquad
\Gamma = \Delta, X, X \to C \qquad
\Gamma = \Delta, (A \vee B) \to C \qquad
\Gamma = \Delta, (A \wedge B) \to C
\]

one of the above rules can be applied. The induction hypothesis gives us either
one proof for each premiss or else one counter-model to one of the premisses,
and thus we either have a proof or a counter-model of
$\Gamma, \alpha^{*} \to C \vdash \blacksquare$. This is also the case when
$\alpha = A \to B$, $\alpha = A \wedge B$, or $\alpha = X$ and $X \in \Gamma$.
Then the problem is reduced to the following case:

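\[
\Gamma = \left\{
\begin{array}{l}
X_1, \ldots, X_n,\ Y_1 \to D_1, \ldots, Y_k \to D_k, \\
(A_1 \to B_1) \to C_1, \ldots, (A_p \to B_p) \to C_p
\end{array}
\right.
\]

with $\{X_1, \ldots, X_n\} \cap \{Y_1, \ldots, Y_k\} = \emptyset$. Moreover,
either $\alpha = A \vee B$, or $\alpha = X$ and
$X \notin \{X_1, \ldots, X_n, Y_1, \ldots, Y_k\}$. We will suppose that
$\alpha = A \vee B$ for the rest of the proof (the other case is simpler). We
introduce $\Delta_i = \Gamma - \{(A_i \to B_i) \to C_i\}$. Then, by induction
hypothesis, either one of the sequents
$\Delta_i, A_i, B_i^{*} \to C_i, \alpha \to C \vdash \blacksquare$ is provable,
or there exists a counter-model for each of them.

In the first case, suppose that
$\Delta_{i_0}, A_{i_0}, B_{i_0}^{*} \to C_{i_0}, \alpha \to C \vdash \blacksquare$
has a proof in SLJ. Then consider the rule

\[
\dfrac{\Delta_{i_0}, A_{i_0}, B_{i_0}^{*} \to C_{i_0}, \alpha \to C \vdash \blacksquare
\quad \Delta_{i_0}, C_{i_0}, \alpha^{*} \to C \vdash \blacksquare}
{\Gamma, \alpha^{*} \to C \vdash \blacksquare}
\]

This rule is right invertible: a counter-model of the second premiss is a
counter-model of the conclusion. On the other hand, a proof of the second
premiss would give a proof of $\Gamma, \alpha^{*} \to C \vdash \blacksquare$.
Thus, we obtain a proof or a counter-model of the conclusion.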

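In the second case, for each $i \in [1,p]$ let $\mathcal{K}_i$ be a
counter-model of
$\Delta_i, A_i, B_i^{*} \to C_i, \alpha \to C \vdash \blacksquare$. Let us
consider $\alpha = A \vee B$.² By induction hypothesis, either one sequent among
$\Gamma, A^{*} \to C, B \to C \vdash \blacksquare$ and
$\Gamma, A \to C, B^{*} \to C \vdash \blacksquare$ is provable, or we have two
counter-models $\mathcal{K}_A$ and $\mathcal{K}_B$. In the first subcase, we
obtain a proof of $\Gamma, (A \vee B)^{*} \to C \vdash \blacksquare$ by applying
one of the two rules for a marked disjunction. In the second subcase, we define
$\mathcal{K} = (\{X_1, \ldots, X_n\}, [\mathcal{K}_1, \ldots, \mathcal{K}_p, \mathcal{K}_A, \mathcal{K}_B])$.
It is only routine to check that $\mathcal{K}$ is a counter-model to
$\Gamma, \alpha^{*} \to C \vdash \blacksquare$. We now give some details of the
proof. As $X_j \in \Delta_i$, we get $\mathcal{K}_i \Vdash X_j$ and so
$X_j \in S_{\mathcal{K}_i}$ for any $j \in [1,n]$ and
$i \in \{1, \ldots, p, A, B\}$. Thus, $\mathcal{K}$ is a Kripke tree and it is
simple to check that
$\mathcal{K} \Vdash X_1, \ldots, X_n, Y_1 \to D_1, \ldots, Y_k \to D_k$.

As $\mathcal{K}_A \nVdash A$ and $\mathcal{K}_B \nVdash B$ we obtain
$\mathcal{K} \nVdash \alpha$, and then it remains to prove that
$\mathcal{K} \Vdash (A_j \to B_j) \to C_j$ holds. For any
$i \in \{1, \ldots, p, A, B\} - \{j\}$, $(A_j \to B_j) \to C_j$ is in the
context ($\Delta_i$ or $\Gamma$) and so
$\mathcal{K}_i \Vdash (A_j \to B_j) \to C_j$. Why does
$\mathcal{K}_j \Vdash (A_j \to B_j) \to C_j$ hold? Because we have
$\mathcal{K}_j \Vdash A_j$ and $\mathcal{K}_j \Vdash B_j \to C_j$. Moreover
$\mathcal{K}_j \nVdash B_j$ and also $\mathcal{K}_j \nVdash A_j \to B_j$, and
thus $\mathcal{K} \nVdash A_j \to B_j$. Finally, if
$\mathcal{K} \Vdash A_j \to B_j$ then $\mathcal{K} \Vdash C_j$.

² Remember we have supposed this but a complete proof would also consider the
case $\alpha = X$ and $X \notin \{X_1, \ldots, X_n, Y_1, \ldots, Y_k\}$.

This completeness proof for the boxed fragment of SLJ also describes an
algorithm that produces a proof or a counter-model out of a given (boxed)
sequent. It can be effectively used to build counter-models.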



4 Sharing Techniques 

In this section, we present an alternative approach, w.r.t. [3], to address
the problem of duplication of formulae during proof-search. It is not logical but
structural: only the structure of sequents is modified, no new formula or
variable is introduced. Looking at the SLJ-system, we can observe that a
duplication occurs when the active formula is of type $(A \vee B) \to C$ and
only in this case, as for example in the rule $[\vee\to]$. Hudelmaier proposed a
way to bypass this problem for complexity reasons: he wanted the linear size of
the sequent to decrease strictly while applying rules of his system and put a
restriction on $C$, which has to be an atomic formula, a variable for example.
But doing this, he had to introduce a new rule in the system so that any formula
$C$ could be replaced by a variable without modifying the meaning of the
sequent:

\[
\dfrac{\Gamma, A \to X, X \to C \vdash G}{\Gamma, A \to C \vdash G}\ [X \text{ new variable}]
\]

By this trick, the system ensures that any duplication is done on a formula of
size 1 and then has the property that the application of rules decreases the
linear size of the sequent. That is why we say that the duplication problem is
addressed on the logical side. We can see that the variable $X$ represents the
sharing of $C$ between $A \to C$ and $B \to C$ by combining the two rules:

\[
\dfrac{\dfrac{\Gamma, A \to X, B \to X, X \to C \vdash G}{\Gamma, (A \vee B) \to X, X \to C \vdash G}\ [\vee\to]}
{\Gamma, (A \vee B) \to C \vdash G}\ [X \text{ new variable}]
\]

But this method has the drawback of introducing some new formulae ($A \to X$ for
example) during the proof-search. This could be a problem when we look for a
resource-conscious implementation of proof-search. It may not be easy to allocate
in advance all the formulae that could appear during the proof-search process.
We propose an alternative approach that consists in encoding the sharing of $C$
between $A \to C$ and $B \to C$ inside the sequent structure.



4.1 A Structural Solution 

Usually, an intuitionistic sequent is a multiset (or set or list, depending on
structural rules) of formulae, as hypothesis, together with a unique formula for
the conclusion. Here, we will consider a new structure encoding the sharing.

Sequent structure. The left part of sequents will have the structure of a forest
(i.e. list of trees). Each tree represents a compound formula where sharing is
taken into account. For this, the nodes of the trees are associated with
subformulae of the initial sequent, which will be called tags. Let $l$ be a leaf
in this tree; we denote by $F_0, \ldots, F_l$ the list of formulae encountered
while following the path from the root of tag $F_0$ to the leaf $l$ tagged with
$F_l$.

Each tree $T$ is associated with a meaning, which is the logical formula
$\bigwedge_l F_l \to \cdots \to F_0$ where $l$ ranges over the leaves of $T$. As
usual, $\to$ is right associative and $F_l \to \cdots \to F_0$ means
$F_l \to (\cdots \to F_0)$. See the example in figure 3. There is an obvious map
from formulae to trees, $F \mapsto \bullet F$, that associates a formula $F$
with the one-node tree whose root is tagged with $F$.

Fig. 3. Meaning function
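The meaning of a tree can be computed by collecting, for each leaf, the tags met on the path from the root and reading them back from the leaf to the root as a chain of implications. The Python sketch below illustrates this under simplifying assumptions: the `Node` class and function names are hypothetical, tags are kept as plain strings, and compound tags are assumed to be already parenthesised by the caller.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    tag: str                                   # tagging formula, kept as text
    sons: list = field(default_factory=list)   # an empty list means a leaf


def paths_to_leaves(node, prefix=()):
    """Yield, for each leaf, the tags F_0, ..., F_l met from the root."""
    path = prefix + (node.tag,)
    if not node.sons:
        yield path
    for son in node.sons:
        yield from paths_to_leaves(son, path)


def meaning(tree):
    """Conjunction over the leaves of F_l -> ... -> F_0 (right associative)."""
    conjuncts = [" -> ".join(reversed(path)) for path in paths_to_leaves(tree)]
    return " /\\ ".join(f"({c})" for c in conjuncts)


# A tree in which the node tagged C is shared by the two leaves A and B.
shared = Node("F", [Node("C", [Node("A"), Node("B")])])
shared_meaning = meaning(shared)
one_node_meaning = meaning(Node("F"))
```

For the tree `shared`, the result is `(A -> C -> F) /\ (B -> C -> F)`: the tag `C` occurs once in the tree but twice in its meaning, which is the point of the sharing. A one-node tree tagged `F` simply means `F`.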

Rules. The rules are derived from the SLJ rules and the preceding meaning
maps. They act on the leaves of the trees and there are two cases: the tree is a
one-node tree or else the root is not a leaf.

In the case of a one-node tree, either the formula is an implication, in which
case the node is rewritten as a two-node tree (rule $[\to_L]$ in figure 5), or
we apply the left conjunction or disjunction rule to obtain one or two one-node
trees.

In the case the root is not a leaf, a formula
$F_l \to (F_{l-1} \to \cdots \to F_0)$ corresponds to the leaf $l$ and thus to a
left implication rule of SLJ. For example, if $F_l$ is $A \wedge B$ the leaf $l$
is replaced by a two-node tree whose leaf is tagged with $A$ and whose root is
tagged with $B$, like it appears in rule $[\wedge\to]$ of figure 5.

Fig. 4. Tree decomposition rule



We focus here on one of the rules that involves duplication. Considering the
trees of figure 4, we see that the tree of the conclusion is associated with a
formula of the shape
$\cdots \wedge [(A \vee B) \to (C \to \cdots \to F)] \wedge \cdots$. Considering
the rule $[\vee\to]$ of SLJ, and the fact that $\wedge$ can be exchanged with
commas (,) on the left side of SLJ-sequents because of the rule $[\wedge_L]$, we
can convert this formula into
$\cdots \wedge [A \to (C \to \cdots \to F) \wedge B \to (C \to \cdots \to F)] \wedge \cdots$.
The new formula exactly corresponds to the tree of the premise. It illustrates
the method by which we derive the tree manipulation rules from SLJ-rules. We
also notice how the duplication has been removed with the help of sharing. The
node tagged with $C$ is shared in the two paths leading to $A$ and $B$. Finally,
$A$ and $B$, which replace $A \vee B$, are subformulae. This fact is a general
one, and all rules introduce only subformulae so that the tags of the trees are
all subformulae of the formulae of the initial sequent. This provides a
subformula property (see section 4.2).
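The tree rule for a disjunction leaf can be sketched as an in-place update: the leaf tagged with a disjunction is replaced by two leaves hanging from the same parent, so the ancestors are never copied. The Python code below is an illustration under assumptions made here (the `Node` class, the tuple encoding of formulae and the function names are hypothetical, and the sequent conclusion is left out).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    tag: object                                # a variable name or (op, A, B)
    sons: list = field(default_factory=list)


def split_disjunction_leaf(parent, leaf):
    """Replace a leaf tagged ("or", A, B) under `parent` by leaves A and B.

    The ancestors (for instance the node tagged C) are not copied: both new
    leaves hang from the same parent object, which is the sharing.
    """
    if not isinstance(leaf.tag, tuple) or leaf.tag[0] != "or" or leaf.sons:
        raise ValueError("expected a leaf tagged with a disjunction")
    _, left, right = leaf.tag
    # Locate the leaf by identity, since dataclass equality compares by value.
    position = next(i for i, son in enumerate(parent.sons) if son is leaf)
    parent.sons[position:position + 1] = [Node(left), Node(right)]


def count_nodes(node):
    return 1 + sum(count_nodes(son) for son in node.sons)


# Tree for (A \/ B) -> (C -> F): root F, inner node C, leaf A \/ B.
leaf = Node(("or", "A", "B"))
c_node = Node("C", [leaf])
tree = Node("F", [c_node])
split_disjunction_leaf(c_node, leaf)
```

After the call, the tree has four nodes (`F`, `C`, `A`, `B`) and the node tagged `C` is still the single object `c_node`, whereas the unshared reading of the same step would hold two copies of `C -> F`.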



It is important to point out that the sharing version of SLJ is not a simple
implementation in terms of trees. In particular, we have previously seen that
proving a boxed sequent only involves boxed sequents. This is no longer the case
in the sharing tree version. The tree version of the rule $[X\to]$ might remove
a whole subtree and this one might contain the marked (or starred) formula. In
this case, the premiss cannot be a boxed sequent and we must reintroduce the
marked formula as the conclusion of the premiss.

Fig. 5. An example of proof-search

An example. Considering figure 5, we aim to prove the sequent
$((A \wedge B) \vee (C \to D)) \to E \vdash F$. First we build the tree of the
subformulae of this sequent. The nodes are given with numbers used to label the
nodes of proof trees. The proof-search tree is written bottom-up as usual. We
start by translating the premiss into a forest (in this case, there is only one
tree) and then we develop this forest following the rules for trees. Notice that
only the leaves of the trees are concerned in those rules. Inner nodes are
“asleep” until all the sons are “removed.” This point has some consequences when
efficient implementation is taken into account (see section 5). Moreover a
marked (starred) formula appears in the tree at the end of the proof-search. As
for SLJ, there might be at most one marked formula in the forest, meaning that
this formula is shared with the conclusion and in this case, the conclusion is
boxed.



4.2 Properties of Structural Sharing 

Structural sharing avoids the duplication of formulae. In fact, anytime a formula 
on a leaf is decomposed, it is always replaced by one or two leaves that are 
tagged with the subformulae of the tagging formula. Sometimes a part of the 
tree can be removed. Let us consider some results about complexity, subformula 
property and counter-models. If we measure the size of the forest by the sum of 
the size of each tagging formula, then the size of any premise of a rule is strictly 
smaller than the size of the conclusion. Moreover, the size of the initial forest is 
exactly the size of initial sequent in the usual sense. 
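The size measure can be made concrete as follows. This Python sketch assumes that the size of a formula counts its variable and connective occurrences (the text does not fix the exact measure) and encodes a tree as a `(tag, sons)` pair; both choices, like the function names, are assumptions made for illustration.

```python
def formula_size(formula):
    """Number of variable and connective occurrences in a formula."""
    if isinstance(formula, str):
        return 1
    _, left, right = formula
    return 1 + formula_size(left) + formula_size(right)


def forest_size(forest):
    """Sum of the sizes of all tagging formulae; a tree is (tag, sons)."""
    return sum(formula_size(tag) + forest_size(sons) for tag, sons in forest)


# Disjunction leaf under a shared node C, before and after decomposition.
before = [("F", [("C", [(("or", "A", "B"), [])])])]
after = [("F", [("C", [("A", []), ("B", [])])])]

# One-node tree tagged A -> B rewritten as a two-node tree (root B, leaf A).
one_node = [(("imp", "A", "B"), [])]
two_nodes = [("B", [("A", [])])]
```

With this measure, `before` has size 5 and `after` has size 4, and the implication step goes from 3 to 2: each decomposition removes one connective and never copies a tag, which is why the measure decreases strictly.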

Theorem 6. The depth of the proof-search tree is bounded by the size of the
sequent (or forest) to prove.

This result is important and already obtained in [3], providing an
$O(n \log n)$-space decision procedure for intuitionistic logic. Here we focus
on an efficient implementation of such a procedure, based on structural sharing.
Another important point is that a tagging formula is always replaced by some of
its direct subformulae.

Theorem 7. The tagging formulae occurring during the proof-search, as the 
forest gets decomposed, are subformulae of the initial sequent (or forest). 

The consequence for implementation is significant. In fact, an occurrence of a
formula $A$ is always decomposed by the same rule, whichever the context. This
only depends on the polarity³ of $A$ as a subformula. It allows the space for
subformulae to be allocated before the proof-search starts, and not dynamically.
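Since only subformula occurrences of the initial sequent can appear, they can all be enumerated, together with their polarity, before the search starts. The Python sketch below shows one way to build such a table; the function name and tuple encoding are assumptions, and polarity is represented as a boolean that flips each time the traversal enters the left-hand side of an implication.

```python
def signed_subformulae(formula, positive=True, out=None):
    """List every subformula occurrence with its polarity.

    Formulae are variable names or (op, A, B) tuples with op in
    {"and", "or", "imp"}. The polarity flips on the left of an implication.
    """
    if out is None:
        out = []
    out.append((formula, positive))
    if not isinstance(formula, str):
        op, left, right = formula
        signed_subformulae(left, positive != (op == "imp"), out)
        signed_subformulae(right, positive, out)
    return out


# ((A /\ B) \/ (C -> D)) -> E, the hypothesis of the example sequent.
hypothesis = ("imp", ("or", ("and", "A", "B"), ("imp", "C", "D")), "E")
table = signed_subformulae(hypothesis)
polarity = dict(table)
```

The table has one entry per occurrence (nine here), so its length is known in advance and can serve as the bound for a statically allocated array in an imperative implementation.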

In section 3.3, we sketched an algorithm to effectively build a counter-model 
when the proof-search fails, like what was already done in [10]. The proof can be 
very easily adapted to the sharing-trees version of the system. In fact, one can 
effectively build Kripke-trees that are strong counter models (see [10]) in case of 
the failure of the proof-search algorithm on sharing trees. 

Theorem 8. The algorithm can produce proofs or counter-models and their
depth is bounded by the size of the sequent to prove.

5 Implementation Issues 

The sharing tree version of the SLJ system is well suited for an efficient
implementation in imperative languages. A proof-search algorithm requires a
bounded memory space (depending on the size of the initial sequent) and it can
be implemented without using dynamic memory allocation at all. However, good
choices are not obvious, especially in terms of time, if we aim to have a light
use of resources. Defining a good proof-search strategy is not an easy task and
can depend on the kind of formulae. Its implementation could be difficult
because it could involve too many administrative tasks like maintaining lists of
formulae or even traversing lists at each point in the proof-search. A strategy
has to take into account that some rules have the invertibility property.
Moreover, in case the algorithm has to return counter-models, the strategies are
restricted to those respecting the order in the completeness proof of
section 3.3.

For a given strategy, the program having a sequent (set of sharing trees) as
input has to choose a leaf of the tree. Going through the tree to search for a
particular kind of leaf is not a good solution and it is necessary to gather
leaves by types. But this information has to be maintained when the proof rules
are applied. To summarise, the basic resources are the sequent represented as a
tagged tree of formulae and the conclusion (or starred formula if we have a
boxed sequent). This information is completed with some administrative
information so as to make these resources accessible in minimal time. The key
points are the following: the leaves should be organised by types to allow an
efficient choice, and it is especially important to avoid the traversal of
leaves at each point of the proof-search; the application of rules should not
involve too many administrative tasks, otherwise the benefits of the added
information would be null; the reverse application of rules is also concerned
by this point, because the non-invertibility of some rules implies some
backtracking.

³ This is the parity of the number of times a subformula occurs on the left of
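
The first key point, gathering leaves by type so that a candidate can be picked without traversing the tree, can be sketched as follows. This is only an illustration in Python under stated assumptions: the class `LeafIndex`, the type names and the use of integer leaf identifiers are hypothetical and are not taken from the C implementation mentioned in the conclusion. All storage is allocated once, and adding, removing and picking a leaf take constant time, so undoing a rule on backtracking is no more expensive than applying it.

```python
class LeafIndex:
    """Fixed-capacity buckets of leaf ids, one bucket per leaf type.

    Hypothetical helper: all arrays are allocated once in __init__,
    so nothing is allocated while the proof-search runs.
    """

    def __init__(self, leaf_types, capacity):
        self.items = {t: [None] * capacity for t in leaf_types}  # bucket storage
        self.count = {t: 0 for t in leaf_types}                  # live entries per bucket
        self.pos = [None] * capacity                              # leaf id -> (type, slot)

    def add(self, leaf, leaf_type):
        slot = self.count[leaf_type]
        self.items[leaf_type][slot] = leaf
        self.pos[leaf] = (leaf_type, slot)
        self.count[leaf_type] += 1

    def remove(self, leaf):
        leaf_type, slot = self.pos[leaf]
        last = self.count[leaf_type] - 1
        moved = self.items[leaf_type][last]
        self.items[leaf_type][slot] = moved   # move the last entry into the hole
        self.pos[moved] = (leaf_type, slot)
        self.pos[leaf] = None
        self.count[leaf_type] = last

    def pick(self, leaf_type):
        """Return some leaf of the requested type, or None if there is none."""
        n = self.count[leaf_type]
        return self.items[leaf_type][n - 1] if n else None


index = LeafIndex(["atom", "implication"], capacity=4)
index.add(0, "atom")
index.add(1, "implication")
index.add(2, "implication")
index.remove(1)                       # e.g. a rule consumed leaf 1
print(index.pick("implication"))      # a remaining implication leaf
index.add(1, "implication")           # backtracking restores the leaf
```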

6 Conclusion 

In this paper, we have presented a new system SLJ derived from ideas of LJT [1] 
and LG [3] which is proved sound and complete for IPL. It is an intermediate 
step towards an efficient implementation of IPL based on the concept of sharing 
trees. This system has very nice properties like the subformula property, a linear 
bound on the depth of proof-search and the ability to generate Kripke counter- 
models. In addition to the theoretical complexity results, our approach provides 
an effective and fine control of space and time resources, adapted to the design 
in imperative languages. A first implementation has been developed in C and 
the practical feasibility of the method has been confirmed. A possible extension 
of this work would be, as in [10], implementing the intuitionistic proof of the 
system based on sharing trees and mechanically extracting a (proved) algorithm 
from this proof. 
Abstract. Hoare logic can be used to verify properties of deterministic
programs by deriving correctness formulae, also called Hoare triples. The
goal of this paper is to extend the Hoare logic to be able to deal with
probabilistic programs. To this end a generic non-uniform language
$\mathcal{L}_{pw}$ with a probabilistic choice operator is introduced and a
denotational semantics $\mathcal{D}$ is given for the language. A notion of
probabilistic predicate is defined to express claims about the state of a
probabilistic program. To reason about the probabilistic predicates a
derivation system pH, similar to that of standard Hoare logic, is given. The
derivation system is shown to be correct with respect to the semantics
$\mathcal{D}$. Some basic examples illustrate the use of the system.



1 Introduction 

Probability is introduced into the description of computer systems to model the
inherent probabilistic behaviour of processes like, for example, a faulty
communication medium. Probability is also explicitly introduced to obtain
randomized algorithms to solve problems which cannot be solved efficiently, or
cannot be solved at all, by deterministic algorithms. With the increasing
complexity of computer programs and systems, formal verification has become an
important tool in the design. The presence of probabilistic elements in a
program usually makes understanding and testing of the program more difficult,
so a way of formally verifying programs becomes even more important.

To formally verify a probabilistic program, the semantics of the program is
given. The mathematical model of the program that is obtained in this way can
be used to directly check properties of the program. In the probabilistic
analysis of the model, results from probability theory are used to obtain e.g.
average performance or bounds on error probabilities [18, 15]. Models that are
often used are Markov chains and Markov decision processes [11, 4],
probabilistic input-output automata [19, 20] and probabilistic transition
systems [9, 8], sometimes augmented with probabilistic bisimulation [14, 10, 2].

For some programs the construction of the mathematical model can already
be difficult. Systematic approaches that simplify the program, or that obtain
properties without actually having to calculate the semantics, are useful.
Approaches in this area are probabilistic process algebra [2] and stochastic
process algebra [6], where equivalences of programs can be checked
syntactically by equational reasoning. Another approach is to introduce a logic
to reason about the probabilistic programs. Here the latter approach is
followed by extending Hoare logic, as known for deterministic programs, to
probabilistic programs. Other work on probabilistic logic can be found in e.g.
[13, 4, 16, 17, 6]. In [13] an algebraic version of propositional dynamic logic
is introduced. In [4] probabilities for temporal logic formulae are calculated.
In [16] a weakest precondition calculus based on ‘expectations’ is defined and
in [17] a notion of probabilistic predicate transformer, also used to find
weakest preconditions, is given. Model checking is used in [1] to check
formulae in a probabilistic temporal logic.

Deterministic Hoare logic is a system to derive correctness formulae, also
called Hoare triples. A formula $\{p\}\ s\ \{q\}$ states that the predicate $p$ is a
sufficient precondition for the program $s$ to guarantee that predicate $q$ is true
after termination. An extensive treatment of Hoare logic can be found in [5].
If the program is probabilistic, the values of the variables in the program,
i.e. the (deterministic) state of the program, cannot be fully determined.
Only the probability of being in a certain state can be given. This gives the
notion of a probabilistic state. In a probabilistic state, a deterministic
predicate will no longer be true or false; it is true with a certain
probability. This can be dealt with by changing the interpretation of validity
of a predicate to a function to [0, 1] instead of to {true, false} as in
[13, 16]. The approach chosen here instead is to extend the syntax of
predicates to allow making claims about the probability that a certain
deterministic predicate holds. The extended predicates are called probabilistic
predicates. A logic for probabilistic programs should reason with these
probabilistic predicates.

In section 2 some mathematical definitions are given. The syntax of the
non-uniform language $\mathcal{L}_{pw}$ is given in section 3 together with its semantics. In
section 4 probabilistic predicates are defined and a Hoare-like logic is introduced
to reason about probabilistic predicates. The logic is shown to be correct with
respect to the denotational semantics. Some examples of the use of the logic are
given in section 5 and some concluding remarks are given in section 6.

2 Mathematical Preliminaries 

A complete partially ordered set (cpo) is a set with a partial order $\le$ that has
a least element and for which each ascending chain has a least upper bound
within the set. An order on $Y$ is extended pointwise to functions from $X$ to $Y$
(for $f, g : X \to Y$, $f \le g$ if $f(x) \le g(x)$ for all $x \in X$).

The support of a function $f : X \to [0, 1]$ is defined as the set of $x \in X$ for which
$f(x) \neq 0$. The set of all functions from $X$ to $[0, 1]$ with countable support is
denoted by $X \to_{cs} [0, 1]$. Given a function $f : X \to_{cs} [0, 1]$ and a set $Y \subseteq X$,
the sum $f[Y] = \sum_{y \in Y} f(y)$ is well-defined (allowing the value $\infty$). The set
of (pseudo) probabilistic measures $\mathcal{M}(X)$ on a set $X$ is defined as the subset of
functions in $X \to_{cs} [0, 1]$ with sum at most 1.

$$\mathcal{M}(X) = \{\, f \in X \to_{cs} [0, 1] \mid f[X] \le 1 \,\}.$$
For a measure $f \in \mathcal{M}(X)$, $f(x)$ (for $x \in X$) is interpreted as the probability
that $x$ occurs. The set $\mathcal{M}(X)$ is a cpo, with minimal element 0, the function that
assigns 0 to each element of $X$. For each ascending sequence in $\mathcal{M}(X)$ the limit
exists within $\mathcal{M}(X)$ and corresponds to the least upper bound of the sequence.

For elements $x \in X$ and $y \in Y$ and a function $f : X \to Y$, $f[x/y]$, called a
variant of $f$, is defined by

$$f[x/y](x') = \begin{cases} y & \text{if } x' = x, \\ f(x') & \text{otherwise.} \end{cases}$$

3 Syntax and Semantics of $\mathcal{L}_{pw}$

The language $\mathcal{L}_{pw}$ is a basic programming language with an extra operator used
to denote probabilistic choice. A typical variable is denoted by $v$, the set of all
variables by Var. The types of the variables are not made explicit. Instead a
set of values Val is fixed as the range for all variables. (The examples deviate
from this assumption and use integer and boolean variables.) Types can easily
be added at the cost of complicating notation with less important details.

Definition 1. The statements in $\mathcal{L}_{pw}$, ranged over by $s$, are given by:

$$s ::= \text{skip} \mid v := e \mid s ; s \mid s \oplus_r s \mid \text{if } c \text{ then } s \text{ else } s \text{ fi} \mid \text{while } c \text{ do } s \text{ od},$$

where $c \in BC$ is a boolean condition, $e \in Exp$ is an expression over values in
Val and variables in Var and $r$ is a ratio in the open interval (0, 1).

The statements are interpreted as follows. The statement $s \oplus_r s'$ makes a
probabilistic choice. With probability $r$ the statement $s$ will be executed, and $s'$
will be executed with probability $1 - r$. The other constructs of $\mathcal{L}_{pw}$ are well
known. The skip statement does nothing. Assignment $v := e$ assigns the value
of the expression $e$ to the variable $v$. Sequential composition $s; s'$ is executed by
first executing $s$, then executing $s'$. The if $c$ then $s$ else $s'$ fi statement executes
$s$ if the condition $c$ holds, and otherwise $s'$. Finally, while $c$ do $s$ od repeatedly
executes $s$ until condition $c$ no longer holds.

The internal details of the boolean conditions (BC) and expressions (Exp)
are abstracted away. Instead of defining an explicit syntax for the boolean
conditions and the expressions, it is assumed that, given the value of the variables,
they can be evaluated. This is made more precise below.

For a deterministic program, the state of the computation is given by the
value of the variables. The state space $S$ for a deterministic program consists of
$S = Var \to Val$. For a probabilistic program, the values of the variables are no
longer determined. For example, after executing $x := 0 \oplus_{1/2} x := 1$, the value of $x$
could be zero but it could also be one. Instead of giving the value of a variable, a
distribution over possible values should be given. A first idea may be to take
as a state space $Var \to \mathcal{M}(Val)$. This does give, for each variable $v$, the chance
that $v$ takes a certain value but it does not describe the possible dependencies
between the variables.
Consider the following example. In the left situation, a fair coin is thrown and
a second coin is put beside it with the same side up. In the right situation, two
fair coins are thrown. The two situations are indistinguishable if the dependency
between the two coins is not known; the probability of heads or tails is 1/2 for
both coins in both situations. The difference between the situations is important
e.g. if the next step is comparing the coins. In the first situation the coins are
always equal. In the second situation they are equal with probability 1/2 only.



Left situation:

                  coin 2
                heads  tails
coin 1  heads    1/2     0
        tails     0     1/2

Right situation:

                  coin 2
                heads  tails
coin 1  heads    1/4    1/4
        tails    1/4    1/4



The more general state space $\Pi = \mathcal{M}(Var \to Val)$ is required. In $\theta \in \Pi$, instead
of giving the distributions for the variables separately, the probability of being
in a certain deterministic state is given. The chance that a variable $v$ takes value
$w$ can be found by summing the probabilities of all states which assign $w$ to $v$.
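
The difference between the two coin situations, and the summation that recovers the distribution of a single variable, can be checked mechanically. The following Python sketch is illustrative only: the dictionary representation of a probabilistic state and the helper names `marginal` and `prob` are assumptions made here, not notation from the paper.

```python
from fractions import Fraction

half, quarter = Fraction(1, 2), Fraction(1, 4)

# A probabilistic state maps each deterministic state (coin1, coin2)
# to the probability of being in that state.
copied = {("heads", "heads"): half, ("tails", "tails"): half}
independent = {(a, b): quarter
               for a in ("heads", "tails") for b in ("heads", "tails")}

def marginal(theta, index, value):
    """Chance that the coin at position `index` shows `value`."""
    return sum((p for state, p in theta.items() if state[index] == value),
               Fraction(0))

def prob(theta, holds):
    """Total probability of the deterministic states satisfying `holds`."""
    return sum((p for state, p in theta.items() if holds(state)), Fraction(0))

for theta in (copied, independent):
    # The marginals agree; only the joint state tells the situations apart.
    print(marginal(theta, 0, "heads"), marginal(theta, 1, "heads"),
          prob(theta, lambda s: s[0] == s[1]))
```

Both states give each coin probability 1/2 of heads, while the probability that the coins are equal is 1 in the first state and 1/2 in the second, as described in the example.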



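Definition 2.

(a) The set of deterministic states $S$, ranged over by $\sigma$, is given by $S = Var \to Val$.

(b) The evaluation functions $\mathcal{V} : Exp \to S \to Val$ and $\mathcal{B} : BC \to S \to \{true, false\}$ are
the functions that compute the value of expressions and boolean conditions.

(c) The set of (pseudo) probabilistic states $\Pi$, ranged over by $\theta$, is given by
$\Pi = \mathcal{M}(S)$.

(d) On $\Pi$ the following operations are defined:

$$\theta_1 \oplus_r \theta_2 = r \cdot \theta_1 + (1 - r) \cdot \theta_2,$$

$$c?\theta(\sigma) = \begin{cases} \theta(\sigma) & \text{if } c \text{ is true in } \sigma \text{, i.e. } \mathcal{B}(c)(\sigma) = true, \\ 0 & \text{otherwise,} \end{cases}$$

$$\theta[v/\mathcal{V}(e)](\sigma) = \theta[\{\, \sigma' \mid \sigma'[v/\mathcal{V}(e)(\sigma')] = \sigma \,\}],$$

where + is standard addition of functions and $r\,\cdot$ is scalar multiplication.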

Note that $\theta \in \Pi$ is a function from $S$ to [0, 1]. The value $\theta(\sigma)$ returned by $\theta$ is
the probability of being in the deterministic state $\sigma$.

The functions $\mathcal{V}$ and $\mathcal{B}$ are assumed given. The syntactic details of expressions
and conditions as well as the precise definitions of these functions are abstracted
away. In a probabilistic state the values of the variables are, in general, not
known and the value of expressions and conditions cannot be found. Evaluation
of expressions and conditions can only be done in a deterministic state. To find
the probability of being in a state $\sigma$ if in $\theta$ the expression $e$ is assigned to variable
$v$, the probabilities of all states $\sigma'$ that yield $\sigma$ after changing the value for $v$ to
that of $e$ (evaluated in $\sigma'$) have to be added.
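
The operations of Definition 2(d) can be sketched in Python as follows. Everything in this sketch is an illustrative assumption: deterministic states are encoded as sorted tuples of (variable, value) pairs, probabilistic states as dictionaries with finite support, and expressions and conditions as Python functions that play the role of the evaluation functions. The function names `state`, `choice`, `restrict` and `assign` are hypothetical.

```python
from fractions import Fraction

def state(**values):
    """A deterministic state: an immutable, hashable variable assignment."""
    return tuple(sorted(values.items()))

def choice(r, theta1, theta2):
    """Probabilistic choice of states: r * theta1 + (1 - r) * theta2."""
    result = {}
    for weight, theta in ((r, theta1), (1 - r, theta2)):
        for sigma, p in theta.items():
            result[sigma] = result.get(sigma, Fraction(0)) + weight * p
    return result

def restrict(cond, theta):
    """c?theta: keep the probability of the states in which cond is true."""
    return {sigma: p for sigma, p in theta.items() if cond(dict(sigma))}

def assign(theta, var, expr):
    """Assignment: each state passes its probability on to its variant."""
    result = {}
    for sigma, p in theta.items():
        env = dict(sigma)
        env[var] = expr(env)            # evaluate the expression in the old state
        target = state(**env)
        result[target] = result.get(target, Fraction(0)) + p
    return result

theta = choice(Fraction(1, 2),
               {state(x=0): Fraction(1)}, {state(x=1): Fraction(1)})
# x := 0 maps both deterministic states to the same target state,
# so their probabilities are added.
print(assign(theta, "x", lambda env: 0))
print(restrict(lambda env: env["x"] == 1, theta))
```

The assignment is computed "forwards" here (every state pushes its probability to its variant), which yields the same sums as collecting, for each target state, all states that are mapped to it.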

The denotational semantics $\mathcal{D}$ for $\mathcal{L}_{pw}$ gives, for each statement $s$ and state
$\theta$, the state $\mathcal{D}(s)(\theta)$ resulting from executing $s$ starting in state $\theta$.
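Definition 3.

(a) The higher-order operator $\Psi_{\langle c,s\rangle} : (\Pi \to \Pi) \to (\Pi \to \Pi)$ is given by

$$\Psi_{\langle c,s\rangle}(\psi)(\theta) = \psi(\mathcal{D}(s)(c?\theta)) + \neg c?\theta.$$

(b) The denotational semantics $\mathcal{D} : \mathcal{L}_{pw} \to (\Pi \to \Pi)$ is given by

$$\begin{aligned}
\mathcal{D}(\text{skip})(\theta) &= \theta \\
\mathcal{D}(v := e)(\theta) &= \theta[v/\mathcal{V}(e)] \\
\mathcal{D}(s; s')(\theta) &= \mathcal{D}(s')(\mathcal{D}(s)(\theta)) \\
\mathcal{D}(s \oplus_r s')(\theta) &= \mathcal{D}(s)(\theta) \oplus_r \mathcal{D}(s')(\theta) \\
\mathcal{D}(\text{if } c \text{ then } s \text{ else } s' \text{ fi})(\theta) &= \mathcal{D}(s)(c?\theta) + \mathcal{D}(s')(\neg c?\theta) \\
\mathcal{D}(\text{while } c \text{ do } s \text{ od}) &= \text{the least fixed point of } \Psi_{\langle c,s\rangle}.
\end{aligned}$$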

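For a while statement while $c$ do $s$ od, one would like to use the familiar
unfolding to if $c$ then $s$; while $c$ do $s$ od else skip fi. This cannot be done directly,
as the second statement is more complex than the first. Instead we can use the
fact that $\mathcal{D}(\text{while } c \text{ do } s \text{ od})$ is a fixed point of the higher-order operator $\Psi_{\langle c,s\rangle}$
to show that

$$\begin{aligned}
\mathcal{D}(\text{while } c \text{ do } s \text{ od})(\theta) &= \Psi_{\langle c,s\rangle}(\mathcal{D}(\text{while } c \text{ do } s \text{ od}))(\theta) \\
&= \mathcal{D}(\text{while } c \text{ do } s \text{ od})(\mathcal{D}(s)(c?\theta)) + \neg c?\theta \\
&= \mathcal{D}(\text{if } c \text{ then } s; \text{while } c \text{ do } s \text{ od else skip fi})(\theta).
\end{aligned}$$

Note that the total probability of $\mathcal{D}(\text{while } c \text{ do } s \text{ od})(\theta)$ may be less than that
of $\theta$. The ‘missing’ probability is the probability of non-termination.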
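
The role of the fixed point and of the 'missing' probability can be made concrete by applying the operator of Definition 3(a) repeatedly, starting from the function that maps every state to the zero measure. The Python sketch below is an illustration under assumptions: a deterministic state is just the value of a single variable x, the semantics of the loop body is passed in as a Python function, and `approx`, `restrict`, `add` and `mass` are hypothetical helper names. Each approximation accounts only for the runs that leave the loop within a bounded number of tests of the condition.

```python
from fractions import Fraction

def add(theta1, theta2):
    """Pointwise sum of two (pseudo) probabilistic states."""
    result = dict(theta1)
    for sigma, p in theta2.items():
        result[sigma] = result.get(sigma, Fraction(0)) + p
    return result

def restrict(cond, theta):
    """c?theta: keep only the states in which cond holds."""
    return {sigma: p for sigma, p in theta.items() if cond(sigma)}

def approx(n, cond, body, theta):
    """Apply the operator of Definition 3(a) n times to the zero function."""
    if n == 0:
        return {}                                    # zero function: no probability
    looping = restrict(cond, theta)                  # c?theta
    done = restrict(lambda s: not cond(s), theta)    # (not c)?theta
    return add(approx(n - 1, cond, body, body(looping)), done)

def mass(theta):
    return sum(theta.values(), Fraction(0))

def body(theta):
    """Semantics of the body  x := 0 (+)_{1/2} x := 1  on a one-variable state."""
    total = mass(theta)
    return {0: total / 2, 1: total / 2} if total else {}

# while x = 0 do x := 0 (+)_{1/2} x := 1 od, started with x = 0
start = {0: Fraction(1)}
for n in range(1, 5):
    print(n, mass(approx(n, lambda x: x == 0, body, start)))
```

For this loop the printed masses are 0, 1/2, 3/4 and 7/8: every finite approximation misses some probability, although the masses grow towards 1.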

The least fixed point of $\Psi_{\langle c,s\rangle}$ can be constructed explicitly. 

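Definition 4. For a statement $s$ define $s^1 = s$ and $s^{n+1} = s; s^n$. The functions
$\mathit{if}^{\,n}_{\langle c,s\rangle}$ and $F_{\langle c,s\rangle}$ from probabilistic states to probabilistic states are given by

$$\begin{aligned}
\mathit{if}^{\,n}_{\langle c,s\rangle}(\theta) &= \mathcal{D}((\text{if } c \text{ then } s \text{ else skip fi})^n)(\theta) \\
F_{\langle c,s\rangle}(\theta) &= \lim_{n \to \infty} \neg c?\,\mathit{if}^{\,n}_{\langle c,s\rangle}(\theta).
\end{aligned}$$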



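Lemma 1. The least fixed point of $\Psi_{\langle c,s\rangle}$ is given by $F_{\langle c,s\rangle}$.

The function $\mathit{if}^{\,n}_{\langle c,s\rangle}$ is merely a shorthand notation. The function $F_{\langle c,s\rangle}$
characterizes the least fixed point of $\Psi_{\langle c,s\rangle}$ and is thus equal to
$\mathcal{D}(\text{while } c \text{ do } s \text{ od})$.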

4 Probabilistic Predicates and Hoare Logic 

The deterministic predicates used with deterministic Hoare logic are first order
predicate formulae. Here $dp$ is used to denote a deterministic predicate. The
usual notions of fulfillment, $\sigma \models dp$, i.e. $dp$ holds in $\sigma$, and substitution, $dp[v/e]$,
for deterministic predicates are assumed to be known. An important property
of substitution is that $\sigma \models dp[v/e]$ exactly when $\sigma[v/\mathcal{V}(e)(\sigma)] \models dp$. (Replacing
the variable by an expression in the predicate is the opposite of assigning the
value of the expression to the variable in the state.)

A deterministic Hoare triple, or correctness formula, $\{dp\}\ s\ \{dp'\}$, describes
that $dp$ is a precondition and $dp'$ is a postcondition of program $s$. The
Hoare triple is said to be correct if executing $s$ in any state that satisfies $dp$
will lead to a state satisfying $dp'$. To extend the Hoare triples to probabilistic
programs, a notion of probabilistic predicate has to be introduced. One option
is to use the same predicates as for deterministic programs but to change the
interpretation of a predicate. A deterministic predicate can be seen as a function
from states to {0, 1}, returning 1 if the state satisfies the predicate and 0
otherwise. The predicates can be made probabilistic by making them into functions
to [0, 1], returning the probability that the predicate is satisfied in a probabilistic
state (see e.g. [13, 17]). This approach, however, does not allow making claims
about the probability within the predicate itself; only the value of the predicate
gives information about the probabilities. A property like “$dp$ holds with
probability $ce$” cannot be expressed as a predicate. Also the normal logical operators
like $\wedge$ have to be extended to work on [0, 1].

In this paper probabilistic predicates can only have a truth value, i.e. true
or false. Probabilistic predicates are predicates in the usual sense, but with an
extended syntax to express claims about probabilities. The construct $\mathbb{P}(dp) \prec ce$,
for $\prec\ \in \{<, \le, =, \ge, >\}$, is the basis for probabilistic predicates. Here $dp$ is any
deterministic predicate and $ce$ is an expression, not using program variables,
evaluating to a number in [0, 1]. The predicate $\mathbb{P}(dp) = ce$ holds in a state $\theta$
if the chance in $\theta$ of being in a deterministic state that satisfies $dp$ is equal to
$ce$. Similarly for the other choices for $\prec$. Probabilistic predicates can be combined
by the logical operators from predicate logic. For example, assuming that Val =
{1, 2, ...}, $\forall i : \mathbb{P}(v = i) = (\tfrac12)^i$ is a valid predicate stating that $v$ has a geometric
distribution. The expression uses the logical variable $i$, but does not depend
on program variables like $v$. Furthermore, for probabilistic predicates $p$ and $p'$,
$p + p'$, $r \cdot p$ and $c?p$ are also probabilistic predicates. Their interpretation is given
below.

Definition 5.

(a) A probabilistic predicate is a basic probabilistic predicate of the form
$\mathbb{P}(dp) \prec ce$ ($\prec\ \in \{<, \le, =, \ge, >\}$) or a composition of probabilistic predicates with one
of the logic operators $\vee$, $\wedge$, $\exists$, $\forall$, or one of the operators $+$, $r\,\cdot$, $c?$.
Probabilistic predicates are ranged over by $p$ and $q$. The following shorthand
notations are also used:

$$\begin{aligned}
p \oplus_r p' &= r \cdot p + (1 - r) \cdot p', \\
[dp] &= (\mathbb{P}(dp) = 1), \\
\text{not } c &= (\mathbb{P}(c) = 0).
\end{aligned}$$
(b) The probabilistic predicates are interpreted as follows.

$\theta \models \mathbb{P}(dp) \prec ce$ when $\sum_{\sigma \models dp} \theta(\sigma) \prec ce$,

$\theta \models p_1 + p_2$ when there exist $\theta_1, \theta_2$ with $\theta = \theta_1 + \theta_2$, $\theta_1 \models p_1$ and $\theta_2 \models p_2$,

$\theta \models r \cdot p$ when there exists $\theta'$ such that $\theta = r \cdot \theta'$ and $\theta' \models p$,

$\theta \models c?p$ when there exists $\theta'$ for which $\theta = c?\theta'$ and $\theta' \models p$.

For the logical connectives the interpretation is as usual.

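(c) Substitution on probabilistic predicates $[v/e]$ is passed down through all
constructs until a deterministic predicate is reached.

$$\begin{aligned}
(\mathbb{P}(dp) \prec ce)[v/e] &= \mathbb{P}(dp[v/e]) \prec ce, \\
(p \ \mathit{op}\ p')[v/e] &= p[v/e] \ \mathit{op}\ p'[v/e] && \mathit{op} \in \{\wedge, \vee, \oplus_r\}, \\
(\mathit{op}\ p)[v/e] &= \mathit{op}\ (p[v/e]) && \mathit{op} \in \{\neg, \exists, \forall, r\,\cdot\}, \\
(c?p)[v/e] &= c[v/e]?(p[v/e]).
\end{aligned}$$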

Note that the extension of deterministic predicates to [0, 1]-valued functions is
more or less incorporated within the probabilistic predicates as used in this
paper. To check the probability of a certain deterministic predicate $dp$ in state
$\theta$, look for which $r$ the predicate $\mathbb{P}(dp) = r$ is true in $\theta$, instead of checking
whether the value of $dp$ in $\theta$ is $r$.

When reasoning about probabilistic predicates, caution is advised. Some
equivalences which may seem true at first sight do not hold. The most important
of these is that in general $p \oplus_r p \neq p$. Take for example $\mathbb{P}(x = 1) = 1 \vee \mathbb{P}(x = 2) = 1$
for $p$; a state satisfying $(\mathbb{P}(x = 1) = \tfrac12) + (\mathbb{P}(x = 2) = \tfrac12)$ will satisfy
$p \oplus_{1/2} p$ but not $p$. Other examples are $p = \exists i : q[i]$ and $p = \forall i : (q[i] \vee q'[i])$. The
equivalence does hold for the basic predicates $\mathbb{P}(dp) \prec r$, and if the equivalence
holds when $p = q$ and when $p = q'$ then it also holds for $p = q \wedge q'$.
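
The counterexample can be checked by exhibiting the witnesses required by the interpretation of + and of scalar multiplication. In this Python sketch the encoding of a state (a dictionary from the value of x to its probability) and the function names are assumptions made for illustration only.

```python
from fractions import Fraction

def prob(theta, holds):
    """Total probability of the deterministic states satisfying a predicate."""
    return sum((q for x, q in theta.items() if holds(x)), Fraction(0))

def p(theta):
    """The predicate  P(x = 1) = 1  or  P(x = 2) = 1."""
    return (prob(theta, lambda x: x == 1) == 1
            or prob(theta, lambda x: x == 2) == 1)

def scale(r, theta):
    return {x: r * q for x, q in theta.items()}

def add(theta1, theta2):
    keys = set(theta1) | set(theta2)
    return {x: theta1.get(x, Fraction(0)) + theta2.get(x, Fraction(0))
            for x in keys}

half = Fraction(1, 2)
theta = {1: half, 2: half}      # x = 1 and x = 2 each with probability 1/2

# Witnesses for theta |= 1/2 . p + 1/2 . p: two point distributions,
# each of which satisfies p on its own.
theta1, theta2 = {1: Fraction(1)}, {2: Fraction(1)}
satisfies_choice = (p(theta1) and p(theta2)
                    and add(scale(half, theta1), scale(half, theta2)) == theta)

print(satisfies_choice, p(theta))   # True False
```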

Using probabilistic predicates, the Hoare triples as introduced for deterministic
programs can be extended to probabilistic programs. A Hoare triple $\{p\}\ s\ \{q\}$
indicates that $p$ is a precondition and $q$ is a postcondition for the probabilistic
program $s$. The Hoare triple is said to hold, denoted by $\models \{p\}\ s\ \{q\}$, if the
precondition $p$ guarantees that postcondition $q$ holds after execution of $s$.

$$\models \{p\}\ s\ \{q\} \quad \text{if} \quad \forall \theta \in \Pi : \theta \models p \Rightarrow \mathcal{D}(s)(\theta) \models q.$$

For example $\models \{p\}\ \text{skip}\ \{p\}$ and $\models \{\mathbb{P}(x = 0) = 1\}\ x := x + 1\ \{\mathbb{P}(x = 1) = 1\}$.

To prove the validity of Hoare triples, a derivation system called pH is intro- 
duced. The derivation system consists of the axioms and rules as given below. 



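$$\{p\}\ \text{skip}\ \{p\} \quad \text{(Skip)}$$

$$\{p[v/e]\}\ v := e\ \{p\} \quad \text{(Assign)}$$

$$\frac{\{p\}\ s\ \{p'\} \qquad \{p'\}\ s'\ \{q\}}{\{p\}\ s; s'\ \{q\}} \quad \text{(Seq)}$$

$$\frac{\{p\}\ s\ \{q\} \qquad \{p\}\ s'\ \{q'\}}{\{p\}\ s \oplus_r s'\ \{q \oplus_r q'\}} \quad \text{(Prob)}$$

$$\frac{\{c?p\}\ s\ \{q\} \qquad \{\neg c?p\}\ s'\ \{q'\}}{\{p\}\ \text{if } c \text{ then } s \text{ else } s' \text{ fi}\ \{q + q'\}} \quad \text{(If)}$$

$$\frac{p \text{ invariant for } (c, s)}{\{p\}\ \text{while } c \text{ do } s \text{ od}\ \{p \wedge \text{not } c\}} \quad \text{(While)}$$

$$\frac{\{p[j]\}\ s\ \{q\} \qquad j \notin p, q}{\{\exists i : p[i]\}\ s\ \{q\}} \quad \text{(Exists)}$$

$$\frac{\{p\}\ s\ \{q[j]\} \qquad j \notin p, q}{\{p\}\ s\ \{\forall i : q[i]\}} \quad \text{(Forall)}$$

$$\frac{p' \Rightarrow p \qquad \{p\}\ s\ \{q\} \qquad q \Rightarrow q'}{\{p'\}\ s\ \{q'\}} \quad \text{(Imp)}$$

$$\frac{\{p\}\ s\ \{q\} \qquad \{p'\}\ s\ \{q\}}{\{p \vee p'\}\ s\ \{q\}} \quad \text{(Or)}$$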



The rules (Skip), (Assign), (Seq) and (Imp) are as within standard Hoare 
logic but now dealing with probabilistic predicates. The rules (If) and (While) 
have changed and the rules (Prob), (Or), (Exists) and (Forall) are new. 

For $p$ to hold after the execution of skip, it should hold before the execution
since skip does nothing. The predicate $p$ holds after an assignment $v := e$ exactly
when $p$ with $e$ substituted for $v$ holds before the assignment, as the effect of the
assignment is exactly replacing $v$ with the value of $e$. The rule (Seq) states that
$p$ is a sufficient precondition for $q$ to hold after execution of $s; s'$ if there exists
an intermediate predicate $p'$ which holds after the execution of $s$ and which
implies that $q$ holds after the execution of $s'$. The rule (Imp) states that the
precondition may be strengthened and the postcondition may be weakened.

The rule (Prob) states that the result of executing $s \oplus_r s'$ is obtained by
combining the results obtained by executing $s$ and $s'$ with the appropriate
probabilities. The necessity for the (Or), (Exists) and (Forall) rules becomes
clear when one recalls that $p \oplus_r p \neq p$. Proving correctness of
$\{p \vee q\}\ \text{skip} \oplus_r \text{skip}\ \{p \vee q\}$ is, in general, not possible without the
(Or) rule. Similar examples show the need for the (Exists) and (Forall) rules.
Note the similarity with the natural deduction rules for $\vee$ and $\exists$ elimination
and $\forall$ introduction.

The rule (If) has changed with respect to the (If) rule of standard Hoare logic.
In a probabilistic state the value of the boolean condition $c$ is not determined.
Therefore the probabilistic state is split into two parts, a part in which $c$ is
true and a part in which $c$ is false. After splitting the state, the effect of the
corresponding statement, either $s$ or $s'$, can be found, after which the parts are
recombined using the $+$ operator.

To use the (While) rule, an invariant $p$ should be found. For $p$ to be an invariant,
it should satisfy $\{p\}\ \text{if } c \text{ then } s \text{ else skip fi}\ \{p\}$. This condition is sufficient
to obtain partial correctness: if the program $s$ terminates and $\{p\}\ s\ \{q\}$ can
be derived from pH, then $\models \{p\}\ s\ \{q\}$. A probabilistic program is said to
terminate if the program is sure to terminate when all probabilistic choices are
interpreted as non-deterministic choices, i.e. if the program terminates for all
possible outcomes of the probabilistic choices. Partial correctness, however, is not
sufficient for probabilistic programs. Many probabilistic programs do not satisfy
the termination condition; they may for instance only terminate with a certain
probability. (Note that, even if that probability is one, the termination condition
need not be satisfied.) To derive valid Hoare triples for programs that need not
terminate, a form of total correctness is required. This requires somehow adding
termination conditions to the rules. To obtain total correctness we strengthen
the notion of invariant by imposing the extra condition of (c, s)-closedness.
Definition 6.

(a) For a predicate $p$ the n-step termination ratio, denoted by $r^n_{\langle c,s\rangle}$, is the
probability that, starting from a state satisfying $p$, the while loop while $c$ do $s$ od
terminates within $n$ steps.

(b) A sequence of states $(\theta_n)_{n \in \mathbb{N}}$ is called a (c, s)-sequence if $(\neg c?\theta_n)_{n \in \mathbb{N}}$ is an
ascending sequence with $\neg c?\theta_n[S] \ge r^n_{\langle c,s\rangle}$.

(c) A predicate $p$ is called (c, s)-closed if each (c, s)-sequence within $p$ has a limit
(least upper bound) within $p$.

$p$ invariant for (c, s) when
$\{p\}\ \text{if } c \text{ then } s \text{ else skip fi}\ \{p\}$ and $p$ is (c, s)-closed.

Note that for a loop while $c$ do $s$ od that terminates, every $p$ automatically
satisfies (c, s)-closedness. Therefore, for a terminating program, there is no need to
check any (c, s)-closedness conditions.

A Hoare triple $\{p\}\ s\ \{q\}$ is said to be derivable from the system pH,
denoted by $\vdash \{p\}\ s\ \{q\}$, if there exists a proof tree for $\{p\}\ s\ \{q\}$ in pH. The
derivation system is correct, i.e. only valid Hoare triples can be derived from pH.



Lemma 2. The derivation system pH is correct, i.e. for all predicates $p$ and $q$
and statements $s$, $\vdash \{p\}\ s\ \{q\}$ implies $\models \{p\}\ s\ \{q\}$.

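Proof. It is sufficient to show that if $\theta \models p$ and $\vdash \{p\}\ s\ \{q\}$ then $\mathcal{D}(s)(\theta) \models q$.
This is shown by induction on the depth of the derivation tree for $\{p\}\ s\ \{q\}$,
by looking at the last rule used. A few cases are given below.

• If the rule (Exists) was used and $\theta \models \exists i : p[i]$ then there is an $i_0$ for which
$\theta \models p[i_0]$. By induction $\models \{p[j]\}\ s\ \{q\}$ which gives, by substituting the value
$i_0$ for the free variable $j$, $\models \{p[i_0]\}\ s\ \{q\}$. But then $\mathcal{D}(s)(\theta) \models q$.

• Known from the non-probabilistic case is that $\sigma[v/\mathcal{V}(e)(\sigma)] \models dp$ exactly when
$\sigma \models dp[v/e]$. By induction on the structure of the probabilistic predicate $p$ this
extends to $\theta[v/\mathcal{V}(e)] \models p$ exactly when $\theta \models p[v/e]$. Correctness of the (Assign)
rule follows directly.

• If rule (Prob) is used to derive $\vdash \{p\}\ s \oplus_r s'\ \{q \oplus_r q'\}$ from $\vdash \{p\}\ s\ \{q\}$
and $\vdash \{p\}\ s'\ \{q'\}$ then by induction $\models \{p\}\ s\ \{q\}$ and $\models \{p\}\ s'\ \{q'\}$.
This means that if $\theta \models p$ then $\mathcal{D}(s)(\theta) \models q$ and $\mathcal{D}(s')(\theta) \models q'$. But then
$\mathcal{D}(s \oplus_r s')(\theta) = \mathcal{D}(s)(\theta) \oplus_r \mathcal{D}(s')(\theta) \models q \oplus_r q'$. The case for rule (If) is similar.

• Assume rule (While) is used with statement $s$, condition $c$ and invariant $p$.
Clearly $\mathcal{D}(\text{while } c \text{ do } s \text{ od})(\theta) \models \text{not } c$, and $\models \{p\}\ \text{if } c \text{ then } s \text{ else skip fi}\ \{p\}$
can be used repeatedly to give that if $\theta \models p$ then $(\mathit{if}^{\,n}_{\langle c,s\rangle}(\theta))_{n \in \mathbb{N}}$ is a (c, s)-sequence.
By (c, s)-closedness $\mathcal{D}(\text{while } c \text{ do } s \text{ od})(\theta) = F_{\langle c,s\rangle}(\theta) \models p$.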
5 Examples

The picture below gives an example of a proof tree in the system pH. For larger
programs, instead of giving the proof tree, a proof outline is used. In a proof
outline the rules (Imp) and (Seq) are implicitly used by writing predicates
between the statements, and some basic steps are skipped. The predicates between
the statements give conditions that the intermediate states in the computation
must satisfy.

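$$
\dfrac{
  \dfrac{
    \dfrac{\dfrac{}{\{[x+1=2]\}\ x := x+1\ \{[x=2]\}}\ \text{(Assign)}}
          {\{[x=1]\}\ x := x+1\ \{[x=2]\}}\ \text{(Imp)}
    \qquad
    \dfrac{\dfrac{}{\{[x+2=3]\}\ x := x+2\ \{[x=3]\}}\ \text{(Assign)}}
          {\{[x=1]\}\ x := x+2\ \{[x=3]\}}\ \text{(Imp)}
  }{
    \{[x=1]\}\ x := x+1 \oplus_{1/2} x := x+2\ \{[x=2] \oplus_{1/2} [x=3]\}
  }\ \text{(Prob)}
}{
  \{[x=1]\}\ x := x+1 \oplus_{1/2} x := x+2\ \{\mathbb{P}(x=2) = \tfrac12 \wedge \mathbb{P}(x=3) = \tfrac12\}
}\ \text{(Imp)}
$$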
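
As a cross-check of the conclusion of this proof tree, the final state can also be computed directly from the denotational semantics (written $\mathcal{D}$ here). The calculation below assumes, for illustration only, a program whose single variable is x; the notation $\delta_{x=n}$ for the state that assigns probability 1 to the deterministic state with $x = n$ is introduced just for this example and is not taken from the paper.

```latex
\begin{align*}
\mathcal{D}(x := x+1 \oplus_{1/2} x := x+2)(\delta_{x=1})
  &= \mathcal{D}(x := x+1)(\delta_{x=1}) \oplus_{1/2} \mathcal{D}(x := x+2)(\delta_{x=1}) \\
  &= \delta_{x=2} \oplus_{1/2} \delta_{x=3} \\
  &= \tfrac12 \cdot \delta_{x=2} + \tfrac12 \cdot \delta_{x=3}
\end{align*}
```

The resulting state assigns probability 1/2 to $x = 2$ and probability 1/2 to $x = 3$, in agreement with the postcondition derived in the proof tree.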

The following program adds an array of numbers, but some elements may
inadvertently get skipped. A lower bound on the probability that the answer will
still be correct is derived. An n-ary version of $\vee$ is used as a shorthand.

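int ss[1 . . . N], t, k;

{ $[true]$ } $\Rightarrow$ { $\mathbb{P}(0 = 0, 1 = 1) = 1$ }
t := 0; k := 1;
{ $\mathbb{P}(t = 0, k = 1) = 1$ } $\Rightarrow$
{ $\mathbb{P}(k = N+1, t = \sum_{i=1}^{N} ss[i]) \ge r^{N} \vee \bigvee_{n=1}^{N} \mathbb{P}(k = n, t = \sum_{i=1}^{n-1} ss[i]) \ge r^{n-1}$ }
while (k $\le$ N) do
    { $\bigvee_{n=1}^{N} \mathbb{P}(k = n, t = \sum_{i=1}^{n-1} ss[i]) \ge r^{n-1}$ } $\Rightarrow$
    { $\bigvee_{n=1}^{N} \mathbb{P}(k = n, t + ss[k] = \sum_{i=1}^{n} ss[i]) \ge r^{n-1}$ }
    t := t + ss[k] $\oplus_r$ skip;
    { $\bigvee_{n=1}^{N} \big( (\mathbb{P}(k = n, t = \sum_{i=1}^{n} ss[i]) \ge r^{n-1}) \oplus_r true \big)$ } $\Rightarrow$
    { $\bigvee_{m=1}^{N+1} \mathbb{P}(k + 1 = m, t = \sum_{i=1}^{m-1} ss[i]) \ge r^{m-1}$ }
    k := k + 1
    { $\bigvee_{m=1}^{N+1} \mathbb{P}(k = m, t = \sum_{i=1}^{m-1} ss[i]) \ge r^{m-1}$ }
od
{ $\bigvee_{n=1}^{N+1} \mathbb{P}(k = n, t = \sum_{i=1}^{n-1} ss[i]) \ge r^{n-1} \wedge \text{not } (k \le N)$ } $\Rightarrow$
{ $\mathbb{P}(k = N+1, t = \sum_{i=1}^{N} ss[i]) \ge r^{N}$ }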
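
The kind of lower bound obtained from such a proof outline can be compared with an exact calculation for a concrete array. The Python sketch below is an illustration and not part of the derivation: it evolves the distribution of t through the loop, assuming that each element is added with probability r and skipped otherwise. The array contents, the value of r and the function name `sum_distribution` are arbitrary assumptions.

```python
from fractions import Fraction

def sum_distribution(ss, r):
    """Distribution of t after the loop: each element is added with probability r."""
    dist = {0: Fraction(1)}                  # t = 0 with probability 1
    for element in ss:                       # one loop iteration per index k
        step = {}
        for t, p in dist.items():
            # t := t + ss[k] with probability r, skip with probability 1 - r
            step[t + element] = step.get(t + element, Fraction(0)) + r * p
            step[t] = step.get(t, Fraction(0)) + (1 - r) * p
        dist = step
    return dist

ss = [3, 0, 5]
r = Fraction(9, 10)
correct = sum_distribution(ss, r)[sum(ss)]   # probability that t is the full sum
print(correct, r ** len(ss), correct >= r ** len(ss))
```

For this array the exact probability of a correct answer is 81/100, which is larger than r^N = 729/1000: skipping the zero element does not change the sum, which illustrates why only a lower bound can be expected in general.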

In the following example, a coin is tossed until heads is thrown. The number of
required throws is shown to be geometrically distributed. For ease of notation
the following shorthand notations are used.

$$\begin{aligned}
p &= q_\infty \vee \exists i : q[i] \\
q_\infty &= \forall j > 0 : \mathbb{P}(x = j, done = true) = (\tfrac12)^j \\
q[i] &= \mathbb{P}(x = i, done = false) = (\tfrac12)^i \wedge \forall j \in \{1, \ldots, i\} : \mathbb{P}(x = j, done = true) = (\tfrac12)^j
\end{aligned}$$
Then, assuming $p$ is an invariant:

{ $[true]$ }
bool done = false; int x = 0;
{ $\mathbb{P}(x = 0, done = false) = 1$ } $\Rightarrow$ { $p$ }
while not done do x := x + 1; done := true $\oplus_{1/2}$ skip od
{ $p \wedge done$ } $\Rightarrow$ { $\forall n > 0 : \mathbb{P}(x = n) = (\tfrac12)^n$ }

To show that $p$ is an invariant, the rule (Or) is used to split the proof into
two parts, the first of which is trivial. For the second part the rule (Exists) is
used to give:

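{ $q[i]$ }
while not done do
    { $(\text{not } done)?q[i]$ } $\Rightarrow$ { $\mathbb{P}(x = i, done = false) = (\tfrac12)^i$ }
    x := x + 1;
    { $\mathbb{P}(x = i + 1, done = false) = (\tfrac12)^i$ }
    done := true $\oplus_{1/2}$ skip
    { $\mathbb{P}(x = i + 1, done = false) = (\tfrac12)^{i+1} + \mathbb{P}(x = i + 1, done = true) = (\tfrac12)^{i+1}$ }
od
{ $\mathbb{P}(x = i + 1, done = false) = (\tfrac12)^{i+1} + \mathbb{P}(x = i + 1, done = true) = (\tfrac12)^{i+1} +
\forall j \in \{1, \ldots, i\} : \mathbb{P}(x = j, done = true) = (\tfrac12)^j$ } $\Rightarrow$ { $q[i + 1]$ } $\Rightarrow$ { $p$ }.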

The requirement that $p$ is (not done, x := x + 1; done := true $\oplus_{1/2}$ skip)-closed is
easy to check but requires the presence of the $q_\infty$ term.
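
The shape of the predicates used in this example can be observed by unrolling the loop a few times. The following Python sketch is only an illustration under assumptions: deterministic states are encoded as pairs (x, done), the coin is fair, the computation starts with x = 0 and done = false, and `iterate` is a hypothetical helper that applies one conditional step of the loop.

```python
from fractions import Fraction

def iterate(theta):
    """One step: if not done then x := x + 1; done := true (+)_{1/2} skip, else skip."""
    result = {}
    for (x, done), p in theta.items():
        if done:
            targets = [((x, True), p)]                 # else branch: skip
        else:
            targets = [((x + 1, True), p / 2),         # heads: stop tossing
                       ((x + 1, False), p / 2)]        # tails: keep tossing
        for sigma, q in targets:
            result[sigma] = result.get(sigma, Fraction(0)) + q
    return result

theta = {(0, False): Fraction(1)}      # x = 0, done = false
i = 4
for _ in range(i):
    theta = iterate(theta)

# After i steps: probability (1/2)^i of still tossing with x = i, and
# probability (1/2)^j of having stopped with x = j, for j = 1, ..., i.
print(theta[(i, False)] == Fraction(1, 2) ** i)
print(all(theta[(j, True)] == Fraction(1, 2) ** j for j in range(1, i + 1)))
```

Both checks print True. The probability of still tossing halves with every step, so a term describing the limit, in which only terminated states remain, is needed in addition to the finite stages.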

6 Conclusions and Further Work 

The main result of this paper is the introduction of a Hoare-like logic, called
pH, for reasoning about probabilistic programs. The programs are written in a
language $\mathcal{L}_{pw}$ and their meaning is given by the denotational semantics $\mathcal{D}$.

The probabilistic predicates used in the logic retain their usual truth value 
interpretation, i.e. they can be interpreted as true or false. Deterministic predi- 
cates can be extended to arithmetical functions yielding the probability that the 
predicate holds as done in e.g. [13] and [17]. This extension is incorporated by 
using the notation P(dp) to refer to exactly that, the chance that deterministic 
predicate dp holds. The chance of dp holding can then be exactly expressed or 
lower and/or upper bounds can be given within a probabilistic predicate. The 
main advantage of keeping the interpretation as truth values is that the logical 
operators do not have to be extended. 

The logic pH is shown to be correct with respect to the semantics $\mathcal{D}$. For an
(earlier) infinite version of the logic a completeness result exists. For the
current logic the question of completeness is still open. Especially the
expressiveness of the probabilistic predicates has to be studied further.

To be able to describe distributed randomized algorithms, it would also be
interesting to extend the language and the logic with parallelism. However,
verification of concurrent systems in general, and extending Hoare logic to
concurrent systems in particular (see e.g. [3, 7]), is already difficult in the
non-probabilistic case.

To make the logic practically useful, the process of checking the derivation of 
a Hoare-triple should be automated. Some work has been done to embed the logic 
in the proof verification system PVS. (See e.g. [12] on non-probabilistic Hoare 
logic in PVS.) The system PVS can then be used both to check the applications 
of the rules and to check the derivation of the implications between predicates 
required for the (Imp) rule. By modeling probabilistic states, PVS could perhaps 
also be used to verify the correctness of the logic; however, this would require a 
lot of work on modeling infinite sums. 

Abstract. A temporal logic of causality (TLC) was introduced by Alur, 
Penczek and Peled in [1]. It is basically a linear time temporal logic 
interpreted over Mazurkiewicz traces which allows quantification over 
causal chains. Through this device one can directly formulate causality 
properties of distributed systems. In this paper we consider an extension 
of TLC by strengthening the chain quantification operators. We show 
that our logic TLC* adds to the expressive power of TLC. We do so by 
defining an Ehrenfeucht-Fraïssé game to capture the expressive power of 
TLC. We then exhibit a property and by means of this game prove that 
the chosen property is not definable in TLC. We then show that the same 
property is definable in TLC*. We prove in fact the stronger result that 
TLC* is expressively stronger than TLC exactly when the dependency 
relation associated with the underlying trace alphabet is not transitive. 



1 Introduction 

One traditional approach to automatic program verification is model checking 
LTL [11] specifications. In this context, the model checking problem is to decide 
whether or not all computation sequences of the system at hand satisfy the 
required properties formulated as an assertion of LTL. Several software packages 
exploiting the rich theory of LTL are now available to carry out the automated 
verification task for quite large finite-state systems. 

Usually computations of a distributed system will constitute interleavings 
of the occurrences of causally independent actions. Often, the computation se- 
quences can be naturally grouped together into equivalence classes of sequences 
corresponding to different interleavings of the same partially ordered computa- 
tion stretch. For a large class of interesting properties expressed by linear time 
temporal logics, it turns out that either all members of an equivalence class sat- 
isfy a certain property or none do. For such properties the verification task can 
be substantially improved by the partial-order methods for verification [6,10]. 

Such equivalence classes can be canonically represented by restricted labelled 
partial orders known as Mazurkiewicz traces [4,8]. These objects allow direct 
formulations of properties expressing concurrency and causality. A number of 
linear time temporal logics to be interpreted directly over Mazurkiewicz traces 
(e.g. [1,3,9,12,13,14,15]) have been proposed in the literature, starting with 
TrPTL [13]. 

Part of this work was done at Lehrstuhl für Informatik VII, RWTH Aachen, Germany. 
Basic Research in Computer Science, 
Centre of the Danish National Research Foundation. 

P.S. Thiagarajan, R. Yap (Eds.): ASIAN’99, LNCS 1742, pp. 126-138, 1999. 
(c) Springer-Verlag Berlin Heidelberg 1999 

Among these, we consider here a temporal logic of causality (TLC) introduced 
in [1] to express serializability (of partially ordered computations) in a direct 
fashion. The operators of TLC are essentially the branching-time operators of 
CTL [2] interpreted over causal chains of traces. However, the expressive power 
of this logic has remained an interesting open problem. Indeed, not much is 
known about the relative expressive powers of the various temporal logics over 
traces. 

What is known is that a linear time temporal logic LTrL, patterned after LTL, 
was introduced in [15] and proven expressively equivalent to the first-order theory 
of traces. LTrL has a simple and natural formulation with very restricted past 
operators, but was shown to be non-elementary in [16]. Recently, it was shown that the 
restricted past operators of LTrL can be replaced by certain new future operators 
while maintaining expressive completeness. In other work, Niebert introduced a 
fixed point based linear time temporal logic [9]. This logic has an elementary-time 
decision procedure and is equal in expressive power to the monadic second-order 
theory of traces. 

However, the expressive powers of most other logics put forth (e.g. [1,12,13]) 
still have an unresolved relationship to each other and, in particular, to first-order 
logic. Most notably, it is still a challenging open problem whether or not TrPTL 
or TLC is expressively weaker than first-order logic. With virtually no other 
separation result known, this paper is a contribution towards understanding the 
relative expressive power of such logics. 

A weakness of TLC is that it doesn’t facilitate direct reasoning about causal 
relationships between the individual events on the causal chains. In this paper 
we remedy this deficiency and extend TLC by strengthening quantification over 
causal chains. This extended logic, which we call TLC*, will enjoy a similarity to 
CTL* [2] that TLC has to CTL. The main result of this paper is that our exten- 
sion TLC* is expressively stronger than TLC for general trace alphabets whereas 
they express the same class of properties over trace alphabets with a transitive 
dependency relation. We prove this result with the aid of an Ehrenfeucht-Fraïssé 
game for traces that we develop. To our knowledge this is the first instance of 
the use of such games to obtain separation results for temporal logics defined 
over partial orders. We believe that this approach is fruitful and that similar 
techniques may lead to other separation results within this area. 

In the next section we briefly recall Mazurkiewicz traces and a few related 
notions. In Section 3 we introduce TLC and TLC*, the main objects of study 
in this paper. We give a very simple and natural example of a property easily 
captured in TLC* but not in TLC. In Section 4 we define an Ehrenfeucht-Fraïssé 
game and prove its correspondence to TLC. We use this correspondence 
in Section 5 to exhibit a property which we prove is undefinable in TLC. In 
Section 6 we show that the said property can be defined within TLC*. Finally, 
we put all the pieces together to arrive at the main result. 

2 Preliminaries 

A (Mazurkiewicz) trace alphabet is a pair $(\Sigma, I)$, where $\Sigma$, the alphabet, is a 
finite set and $I \subseteq \Sigma \times \Sigma$ is an irreflexive and symmetric independence relation. 
Usually, $\Sigma$ consists of the actions performed by a distributed system while $I$ 
captures a static notion of causal independence between actions. For the rest of 
the section we fix a trace alphabet $(\Sigma, I)$. We define $D = (\Sigma \times \Sigma) - I$ to be the 
dependency relation which is then reflexive and symmetric. 

Let $T = (E, \le, \lambda)$ be a $\Sigma$-labelled poset. In other words, $(E, \le)$ is a poset 
and $\lambda : E \to \Sigma$ is a labelling function. For $e \in E$ we define $\downarrow e = \{x \in E \mid x \le e\}$. 
We also let $\lessdot$ be the covering relation given by $x \lessdot y$ iff $x < y$ and for all 
$z \in E$, $x \le z \le y$ implies $x = z$ or $z = y$. Moreover, we let the concurrency relation be 
defined as $x \; co \; y$ iff $x \not\le y$ and $y \not\le x$. A Mazurkiewicz trace (over $(\Sigma, I)$) is then 
a $\Sigma$-labelled poset $T = (E, \le, \lambda)$ satisfying: 

(T1) $\forall e \in E$. $\downarrow e$ is a finite set. 

(T2) $\forall e, e' \in E$. $e \lessdot e'$ implies $\lambda(e) \; D \; \lambda(e')$. 

(T3) $\forall e, e' \in E$. $\lambda(e) \; D \; \lambda(e')$ implies $e \le e'$ or $e' \le e$. 
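
For finite traces, the relations just defined can be computed directly. The following Python sketch is only an illustration and not part of the original development. It assumes that events are identified with the positions of a finite word, that the causal order is the transitive closure of the pairs of earlier and later positions carrying dependent letters, and that the initial event is omitted. The names `build_trace`, `leq`, `cover` and `co`, and the encoding of the independence relation as a set of frozensets, are hypothetical.

```python
def build_trace(word, independent):
    """Causal order of the trace generated by `word`.

    `independent` is a set of frozensets {x, y} of independent letters
    (a hypothetical encoding of I); every other pair is dependent.
    Events are the positions 0..len(word)-1; the initial event is omitted.
    """
    n = len(word)

    def dependent(x, y):
        return frozenset((x, y)) not in independent

    # leq[i][j] is True iff event i is causally below or equal to event j.
    leq = [[i == j for j in range(n)] for i in range(n)]
    for j in range(n):
        for i in range(j):
            if dependent(word[i], word[j]):
                # Everything below i (including i itself) is below j.
                for h in range(i + 1):
                    if leq[h][i]:
                        leq[h][j] = True

    def lt(i, j):
        return i != j and leq[i][j]

    # Covering relation: i < j with no event strictly in between.
    cover = {(i, j) for i in range(n) for j in range(n)
             if lt(i, j) and not any(lt(i, z) and lt(z, j) for z in range(n))}
    # Concurrency relation: neither i <= j nor j <= i.
    co = {(i, j) for i in range(n) for j in range(n)
          if not leq[i][j] and not leq[j][i]}
    return leq, cover, co


# Example: a and b are independent, and both depend on c.
leq, cover, co = build_trace("abcab", {frozenset("ab")})
```

In this example the two leading events labelled a and b are concurrent, both are covered by the c-event, and the c-event is in turn covered by the two trailing events.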

We shall let $TR(\Sigma, I)$ denote the class of traces over $(\Sigma, I)$. As usual, a trace 
language $L$ is a subset of traces, i.e. $L \subseteq TR(\Sigma, I)$. Throughout the paper we 
will not distinguish between isomorphic elements in $TR(\Sigma, I)$. We will refer to 
members of $E$ as events. It will be convenient to assume the existence of a 
unique least event $\bot \in E$ corresponding to a system initialization event carrying 
no label, i.e. $\lambda(\bot)$ is undefined and $\bot < e$ for every $e \in E - \{\bot\}$. We will 
sometimes abuse notation and let a string in $\Sigma^*$ denote its corresponding trace 
over $(\Sigma, I)$ whenever no confusion arises. This is enforced by using conventional 
parentheses for string languages and square brackets for trace languages. 

In setting the scene for defining the semantics of formulas of TLC* we first 
introduce some notation for sequences. The length of a finite sequence $\rho$ will be 
denoted by $|\rho|$. In case $\rho$ is infinite we set $|\rho| = \omega$. Let $\rho = (e_0, e_1, \ldots, e_n, \ldots)$ 
and $0 \le k \le |\rho|$. We set $\rho^k = (e_k, e_{k+1}, \ldots, e_n, \ldots)$. 

Let $T = (E, \le, \lambda)$ be a trace over $(\Sigma, I)$. A future causal chain rooted at 
$e \in E$ is a (finite or infinite) sequence $\rho = (e_0, e_1, \ldots, e_n, \ldots)$ with $e = e_0$, $e_i \in E$ 
such that $e_{i-1} \lessdot e_i$ for every $i \ge 1$. The labelling function $\lambda : E \to \Sigma$ is extended 
to causal chains in the obvious way by: $\lambda(\rho) = (\lambda(e_0)\lambda(e_1) \cdots \lambda(e_n) \cdots)$. We say 
that a future causal chain $\rho$ is maximal in case $\rho$ is either infinite or it is finite 
and there exists no $e' \in E$ such that $e_{|\rho|} \lessdot e'$. A past causal chain rooted at $e \in E$ 
is defined in the obvious manner. 
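
As an illustration of maximal future causal chains, the following Python sketch enumerates them in a finite trace. It is not taken from the paper: the function name `maximal_chains` and the dictionary `succ`, which maps each event to the events covering it, are hypothetical, and the trace is assumed finite so that every chain is finite.

```python
def maximal_chains(succ, e):
    """Yield every maximal future causal chain rooted at event `e`.

    `succ` maps each event to the list of events that cover it.
    The trace is assumed to be finite, so every chain is finite.
    """
    if not succ[e]:
        # No covering successor: the chain cannot be extended.
        yield [e]
        return
    for nxt in succ[e]:
        for tail in maximal_chains(succ, nxt):
            yield [e] + tail


# Covering relation of the trace [abcabc], with a and b independent
# and both dependent on c; "a1" is the a-event of the first factor.
succ = {"a1": ["c1"], "b1": ["c1"], "c1": ["a2", "b2"],
        "a2": ["c2"], "b2": ["c2"], "c2": []}
chains = list(maximal_chains(succ, "a1"))
# Extend the labelling to chains: the first character of an event is its label.
labels = ["".join(ev[0] for ev in chain) for chain in chains]
```

Here the two maximal chains rooted at the first a-event carry the label sequences acac and acbc.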

3 Syntax and Semantics 

In this section we will define the syntax and semantics of the temporal logics 
over traces to be considered in this paper. We start by introducing TLC* and 
continue by giving an explicit definition of the sublogic TLC. We will not define 
first-order logic over traces (FO), but we refer the reader to e.g. [14,15]. 

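TLC* consists of three different syntactic entities: event formulas ($\Phi_{ev}$), 
future chain formulas ($\Phi^{+}_{ch}$) and past chain formulas ($\Phi^{-}_{ch}$), defined by mutual 
induction as described below: 

$\Phi_{ev} ::= p_a \mid \neg\alpha \mid \alpha_1 \vee \alpha_2 \mid co(\alpha) \mid E(\phi) \mid E^{-}(\psi)$, with $a \in \Sigma$ 

$\Phi^{+}_{ch} ::= \alpha \mid \neg\phi \mid \phi_1 \vee \phi_2 \mid X\phi \mid \phi_1 U \phi_2$ 

$\Phi^{-}_{ch} ::= \alpha \mid \neg\psi \mid \psi_1 \vee \psi_2 \mid X^{-}\psi \mid \psi_1 U^{-} \psi_2$ 

where $\alpha$, $\phi$ and $\psi$ (with or without subscripts) are formulas of $\Phi_{ev}$, $\Phi^{+}_{ch}$ and $\Phi^{-}_{ch}$, 
respectively. The formulas of TLC*$(\Sigma, I)$ are the set of event formulas $\Phi_{ev}$ as 
defined above^1. 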

The semantics of formulas of TLC* is divided into two parts: event formulas 
and chain formulas. Let $T \in TR(\Sigma, I)$ and $e \in E$. The notion of an event formula 
$\alpha$ being satisfied at an event $e$ of $T$ is defined inductively in the following manner. 

— $T, e \models p_a$ iff $\lambda(e) = a$. 

— $T, e \models \neg\alpha$ iff $T, e \not\models \alpha$. 

— $T, e \models \alpha_1 \vee \alpha_2$ iff $T, e \models \alpha_1$ or $T, e \models \alpha_2$. 

— $T, e \models co(\alpha)$ iff there exists an $e' \in E$ with $e \; co \; e'$ and $T, e' \models \alpha$. 

— $T, e \models E(\phi)$ iff there exists a future causal chain $\rho$ rooted at $e$ with $T, \rho \models \phi$. 

— $T, e \models E^{-}(\psi)$ iff there exists a past causal chain $\rho$ rooted at $e$ with $T, \rho \models \psi$. 

As usual, $tt = p_a \vee \neg p_a$ and $ff = \neg tt$. Suppose $\rho = (e_0, e_1, \ldots, e_n, \ldots)$ is a future 
causal chain. The notion of $T, \rho \models \phi$ for a future chain formula $\phi$ is defined 
inductively below. 

— $T, \rho \models \alpha$ iff $T, e_0 \models \alpha$. 

— $T, \rho \models \neg\phi$ iff $T, \rho \not\models \phi$. 

— $T, \rho \models \phi_1 \vee \phi_2$ iff $T, \rho \models \phi_1$ or $T, \rho \models \phi_2$. 

— $T, \rho \models X\phi$ iff $T, \rho^1 \models \phi$. 

— $T, \rho \models \phi_1 U \phi_2$ iff there exists a $0 \le k \le |\rho|$ such that $T, \rho^k \models \phi_2$. Moreover, 
$T, \rho^m \models \phi_1$ for each $0 \le m < k$. 

The notion of $T, \rho \models \psi$ for a past causal chain $\rho$ and past chain formula $\psi$ is 
defined in the straightforward manner. The well-known future chain operators 
are derived as $F\phi = tt \, U \, \phi$ and $G\phi = \neg F \neg\phi$. 

Suppose $T \in TR(\Sigma, I)$ and $\alpha \in$ TLC*$(\Sigma, I)$. Then $T$ satisfies $\alpha$ iff $T, \bot \models \alpha$, 
denoted $T \models \alpha$. The language defined by $\alpha$ is: $\mathcal{L}(\alpha) = \{T \in TR(\Sigma, I) \mid T \models \alpha\}$. 
We say that $L \subseteq TR(\Sigma, I)$ is definable in TLC* if there exists some 
$\alpha \in$ TLC*$(\Sigma, I)$ such that $\mathcal{L}(\alpha) = L$. By slight abuse of notation, the class of trace 
languages over $(\Sigma, I)$ definable in TLC* will also be denoted by TLC*$(\Sigma, I)$. 

^1 Another logic was termed “TLC*” in [1], but as that logic denoted TLC interpreted 
over linearizations it is unrelated to our logic, which seems naturally to earn the name 
“TLC*”. 

The formulas of TLC$(\Sigma, I)$, introduced in [1] with a slightly different syntax, 
are then the set of formulas of TLC*$(\Sigma, I)$ where each of the chain operators 
$X$, $U$, $G$, $X^{-}$, $U^{-}$ is immediately preceded by a chain quantifier $E$. As TLC will 
play a prominent role in this paper we will bring out its definition in more detail. 
More precisely, the set of formulas is given as: 

TLC$(\Sigma, I)$ $::= p_a \mid \neg\alpha \mid \alpha \vee \beta \mid EX(\alpha) \mid EU(\alpha, \beta) \mid EG(\alpha) \mid EX^{-}(\alpha) \mid EU^{-}(\alpha, \beta) \mid co(\alpha)$, 

where $a \in \Sigma$. The semantics is inherited directly from TLC* in the obvious 
manner, so notions of definability etc. are carried over directly. It can be shown 
that our extension TLC* remains decidable. In work to appear we construct an 
elementary-time decision procedure for TLC* by means of Büchi automata. 

Hence while the formulas of TLC are basically the well-known operators of the 
branching-time logic CTL [2] augmented with symmetrical past operators and 
concurrency information, the operators of TLC* are basically the well-known 
operators of CTL* [2] similarly extended with past quantifiers in a restricted 
fashion as well as concurrency information. The crucial difference is that while 
CTL and CTL* are branching-time logics interpreted over Kripke structures, 
TLC and TLC* are linear time temporal logics on traces interpreted over the 
underlying Hasse diagrams of the partial orders. 

One of the weaknesses of TLC is that it doesn’t directly facilitate reasoning 
about causal relationships of the individual events of the causal chains at hand. 
As a consequence, a number of interesting properties are not (either easily or at 
all) expressible within TLC. Section 5 provides a formal proof of this claim, but 
we will in the following bring out another such property which is very natural. 

Suppose that $a$ and $b$ are actions representing the acquiring and releasing, 
respectively, of some resource. A relevant property of this system is then whether 
or not there exists some causal chain in the execution of the system (presumably 
containing other system actions than $\{a, b\}$) such that the $a$'s and 
$b$'s alternate strictly until the task is perhaps eventually completed. Via the 
future chain formula $\phi_{xy} = p_x \rightarrow X(\neg(p_x \vee p_y) \, U \, (p_y))$ we can easily express this 
property in TLC* by $E(G(\phi_{ab} \wedge \phi_{ba}))$. The point here is that TLC* allows us 
to investigate each causal chain in question by a causal chain formula, which is 
then confined to this very chain. This is not possible in TLC, as the existential 
quantifications interpreted at some fixed event of the chain would potentially 
consider all causal chains originating at this event, not just the one presently 
being investigated. 
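
To make the chain semantics concrete, the following Python sketch evaluates future chain formulas over the label sequence of a single finite causal chain. It is an illustration under stated assumptions rather than the paper's construction: event formulas are restricted to atomic propositions, a formula $X\phi$ is taken to be false at the last position of a finite chain, and the formula $\phi_{xy}$ is read as $p_x \rightarrow X(\neg(p_x \vee p_y) \, U \, p_y)$. The tuple encoding and all function names are hypothetical.

```python
def holds(f, labels, k=0):
    """Does chain formula `f` hold on the suffix of `labels` starting at k?"""
    op = f[0]
    if op == "tt":
        return True
    if op == "p":                       # atomic proposition p_a
        return labels[k] == f[1]
    if op == "not":
        return not holds(f[1], labels, k)
    if op == "or":
        return holds(f[1], labels, k) or holds(f[2], labels, k)
    if op == "X":                       # assumed false at the last position
        return k + 1 < len(labels) and holds(f[1], labels, k + 1)
    if op == "U":
        for j in range(k, len(labels)):
            if holds(f[2], labels, j):  # right argument reached
                return True
            if not holds(f[1], labels, j):
                return False            # left argument broken too early
        return False
    raise ValueError(f"unknown operator: {op}")


def p(a): return ("p", a)
def neg(f): return ("not", f)
def lor(f, g): return ("or", f, g)
def implies(f, g): return lor(neg(f), g)
def F(f): return ("U", ("tt",), f)      # F f = tt U f
def G(f): return neg(F(neg(f)))         # G f = not F not f


def phi(x, y):
    """After an x, no further x or y occurs until a y occurs."""
    return implies(p(x), ("X", ("U", neg(lor(p(x), p(y))), p(y))))
```

For example, `holds(G(phi("a", "b")), list("adbdab"))` evaluates to `True`, whereas the chain labelled `adab` violates the formula because a second a occurs before any b.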

We conclude with two important notions relating to TLC. Firstly, let $\alpha$ be a 
formula of TLC$(\Sigma, I)$. The operator depth of $\alpha$ is defined inductively as follows: 
$od(p_a) = 0$, $od(\neg\alpha) = od(\alpha)$, $od(\alpha \vee \beta) = \max(od(\alpha), od(\beta))$, $od(EX(\alpha)) = 
od(EG(\alpha)) = od(EX^{-}(\alpha)) = od(co(\alpha)) = 1 + od(\alpha)$ and $od(EU(\alpha, \beta)) = 
od(EU^{-}(\alpha, \beta)) = 1 + \max(od(\alpha), od(\beta))$. The set of formulas of operator depth 
$k$ is denoted by $OD(k)$. 
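
The inductive definition of operator depth translates directly into a recursive function. The Python sketch below is illustrative only; the encoding of TLC formulas as nested tuples and the operator tags `"EX-"` and `"EU-"` for the past operators are assumptions made for this example.

```python
def od(f):
    """Operator depth of a TLC formula given as a nested tuple."""
    op = f[0]
    if op == "p":                                  # atomic proposition p_a
        return 0
    if op == "not":
        return od(f[1])
    if op == "or":
        return max(od(f[1]), od(f[2]))
    if op in ("EX", "EG", "EX-", "co"):            # unary operators
        return 1 + od(f[1])
    if op in ("EU", "EU-"):                        # binary until operators
        return 1 + max(od(f[1]), od(f[2]))
    raise ValueError(f"unknown operator: {op}")


p_a, p_b = ("p", "a"), ("p", "b")
# EU(EX(p_a), not co(EG(p_b))) has depth 1 + max(1, 2) = 3.
example = ("EU", ("EX", p_a), ("not", ("co", ("EG", p_b))))
```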

Given $T_0, T_1 \in TR(\Sigma, I)$ and events $e_i$ of $T_i$ we define that $(T_0, e_0) \equiv_n (T_1, e_1)$ 
if for any formula $\alpha \in$ TLC$(\Sigma, I)$ with $od(\alpha) \le n$, $T_0, e_0 \models \alpha$ iff $T_1, e_1 \models \alpha$, 
i.e. both structures agree on all formulas of operator depth at most $n \ge 0$. 
It is then not hard to see that $(T_0, e_0) \equiv_0 (T_1, e_1)$ iff $e_0$ and $e_1$ are identically 
labelled, i.e. either $\lambda(e_0) = \lambda(e_1)$ or $e_0 = e_1 = \bot$. 

4 An Ehrenfeucht-Fraïssé Game for TLC 

In this section we will present an Ehrenfeucht-Fraïssé game to capture the 
expressive power of TLC. The game is played directly on the poset representation 
of (finite or infinite) Mazurkiewicz traces and it is similar in spirit to the 
Ehrenfeucht-Fraïssé game for LTL introduced by Etessami and Wilke [5]. We 
extend their approach to the richer setting of traces by highlighting current causal 
chains in the until-based moves and adding past- and co-moves. 

The EF-TLC game is a game played between two persons, Spoiler and Preserver, 
on a pair of traces $(T_0, T_1)$. The game is played over $k$ rounds starting 
from an initial game state $(e_0, e_1)$ and after each round the current game state is 
a pair of events $(e'_0, e'_1)$ with $e'_i \in E_i$. Each round starts with the game in some 
specific initial game state $(e_0, e_1)$ and Spoiler chooses one of the moves defined 
below and the game proceeds accordingly: 

EX-Move: This move can only be played by Spoiler if there exists an $e'_0 \in E_0$ 
such that $e_0 \lessdot e'_0$ or there exists an $e'_1 \in E_1$ such that $e_1 \lessdot e'_1$. Spoiler then 
wins the game in case there either exists no $e'_0 \in E_0$ such that $e_0 \lessdot e'_0$ or 
no $e'_1 \in E_1$ such that $e_1 \lessdot e'_1$. Otherwise (in which case both $e_0$ and $e_1$ have 
$\lessdot$-successors) the game proceeds as follows: (1) Spoiler chooses $i \in \{0,1\}$, 
and an event $e'_i \in E_i$ such that $e_i \lessdot e'_i$. (2) Preserver responds by choosing 
an event $e'_{1-i} \in E_{1-i}$ such that $e_{1-i} \lessdot e'_{1-i}$. (3) The new game state is now 
$(e'_0, e'_1)$. 

EU-Move: (1) Spoiler chooses $i \in \{0,1\}$, and an event $e'_i \in E_i$ such that $e_i \le e'_i$ 
and he highlights a future causal chain $(e_i = f_i^0, f_i^1, \ldots, f_i^n = e'_i)$ with $n \ge 0$. 
(2) Preserver responds by choosing an event $e'_{1-i} \in E_{1-i}$ with $e_{1-i} \le e'_{1-i}$ 
such that if $e_i = e'_i$ then $e_{1-i} = e'_{1-i}$. Furthermore she highlights a future 
causal chain $(e_{1-i} = f_{1-i}^0, f_{1-i}^1, \ldots, f_{1-i}^m = e'_{1-i})$ with $m \ge 0$. (3) Spoiler now 
chooses one of the following two steps: (3a) Spoiler sets the game state to 
$(e'_0, e'_1)$. (3b) Spoiler chooses an event $f_{1-i} \in \{f_{1-i}^0, f_{1-i}^1, \ldots, f_{1-i}^{m-1}\}$. Preserver 
responds with an event $f_i \in \{f_i^0, f_i^1, \ldots, f_i^{n-1}\}$ and the game continues in the 
state $(f_0, f_1)$. 

EG-Move: (1) Spoiler chooses $i \in \{0,1\}$, and highlights a maximal future 
causal chain $(e_i = f_i^0, f_i^1, \ldots, f_i^n, \ldots)$ with $f_i^j \in E_i$ and $n \ge 0$. (2) Preserver 
responds by highlighting a maximal future causal chain 
$(e_{1-i} = f_{1-i}^0, f_{1-i}^1, \ldots, f_{1-i}^m, \ldots)$ with $f_{1-i}^j \in E_{1-i}$ and $m \ge 0$. (3) Spoiler chooses 
an event $f_{1-i} \in \{f_{1-i}^0, f_{1-i}^1, \ldots, f_{1-i}^m, \ldots\}$. Preserver responds with an event 
$f_i \in \{f_i^0, f_i^1, \ldots, f_i^n, \ldots\}$ and the game continues in the state $(f_0, f_1)$. 

co-Move: This move can only be played by Spoiler if there exists an $e'_0 \in E_0$ 
such that $e_0 \; co \; e'_0$ or there exists an $e'_1 \in E_1$ such that $e_1 \; co \; e'_1$. Spoiler 
then wins the game in case there either exists no $e'_0 \in E_0$ such that $e_0 \; co \; e'_0$ 
or no $e'_1 \in E_1$ such that $e_1 \; co \; e'_1$. Otherwise (in which case both $e_0$ and $e_1$ 
have concurrent events) the game proceeds as follows: (1) Spoiler chooses 
$i \in \{0,1\}$, and an event $e'_i \in E_i$ such that $e_i \; co \; e'_i$ in $T_i$. (2) Preserver 
responds by choosing an event $e'_{1-i} \in E_{1-i}$ such that $e_{1-i} \; co \; e'_{1-i}$ in $T_{1-i}$. 
(3) The new game state is now $(e'_0, e'_1)$. 

There are analogous $EX^{-}$- and $EU^{-}$-moves. Here and throughout we refer to 
the full version [7] for more details. 

In the 0-round game Spoiler wins if $(T_0, e_0) \not\equiv_0 (T_1, e_1)$ and otherwise 
Preserver wins. In the $(k+1)$-round game Spoiler wins if $(T_0, e_0) \not\equiv_0 (T_1, e_1)$. If it 
is the case that $(T_0, e_0) \equiv_0 (T_1, e_1)$, a round is played according to the above 
moves. This round either results in a win for Spoiler (e.g. by the EX-move) 
or a new game state $(e'_0, e'_1)$. In the latter case, a $k$-round game is then played 
starting from the initial game state $(e'_0, e'_1)$. 

We say that Preserver has a winning strategy in the $k$-round game on $(T_0, e_0)$ 
and $(T_1, e_1)$, denoted $(T_0, e_0) \sim_k (T_1, e_1)$, if she can win the $k$-round game on 
the structures $T_0$ and $T_1$ starting in the initial game state $(e_0, e_1)$ no matter 
which moves are performed by Spoiler. If not, we say that Spoiler has a winning 
strategy. We refer to [5] for basic intuitions about the game. 

Our interest in the game lies in the following fact. 

Proposition 1. For every $k \ge 0$, $(T_0, e_0) \sim_k (T_1, e_1)$ iff $(T_0, e_0) \equiv_k (T_1, e_1)$. 

Proof. We prove that $(T_0, e_0) \sim_k (T_1, e_1)$ iff $(T_0, e_0) \equiv_k (T_1, e_1)$ by induction on 
$k$. The base case where $k = 0$ follows trivially from the definition. 

For the inductive step suppose that the claim is true for $k$. We first prove 
the direction from left to right. Suppose that $(T_0, e_0) \sim_{k+1} (T_1, e_1)$. Let 
$\alpha \in$ TLC$(\Sigma, I)$ with $od(\alpha) = k + 1$. We must show that $T_0, e_0 \models \alpha$ iff $T_1, e_1 \models \alpha$. 
It suffices to prove the statement when the top-level connective of $\alpha$ is a 
chain-operator because by boolean combinations $(T_0, e_0)$ and $(T_1, e_1)$ would then agree 
on all formulas of operator depth $k + 1$. We will only consider the case where 
the top-level chain-operator is $EU$. The other cases follow similarly. 

Suppose now $\alpha = EU(\beta, \beta')$. Assume without loss of generality that $T_0, e_0 \models \alpha$, 
i.e. there exists a future causal chain $\rho_0 = (f_0^0, f_0^1, \ldots, f_0^n)$ with $e_0 = f_0^0$ and 
$f_0^n = e'_0$ such that $T_0, f_0^j \models \beta$ for each $0 \le j < n$ and $T_0, e'_0 \models \beta'$. Hence we 
let Spoiler play the EU-move on $T_0$ and make him highlight $\rho_0$ on $T_0$. Preserver 
now uses her winning strategy and highlights $\rho_1 = (f_1^0, f_1^1, \ldots, f_1^m)$ with $e_1 = f_1^0$ 
and $f_1^m = e'_1$. Two subcases now arise. 

Assume first that Spoiler sets the new game state to $(e'_0, e'_1)$. As $e'_1$ was 
chosen from Preserver's winning strategy we have that $(T_0, e'_0) \sim_k (T_1, e'_1)$ which 
by induction hypothesis implies that $(T_0, e'_0) \equiv_k (T_1, e'_1)$. Thus $T_1, e'_1 \models \beta'$. Now, 
assume that Spoiler instead picked an event $f_1$ on $\rho_1$. By Preserver's winning 
strategy she could pick an event $f_0$ on $\rho_0$ (this is possible due to the requirement 
that if $e_0 = e'_0$ then $e_1 = e'_1$). Again by the winning strategy we have that 
$(T_0, f_0) \sim_k (T_1, f_1)$ and by induction hypothesis that $T_1, f_1 \models \beta$. Hence $T_1, e_1 \models 
EU(\beta, \beta')$, which concludes this direction of the proof. 

We prove the direction from right to left by contraposition, so suppose that 
$(T_0, e_0) \not\sim_{k+1} (T_1, e_1)$. We will then exhibit a formula $\alpha \in$ TLC$(\Sigma, I)$ with 
$od(\alpha) = k + 1$ such that $T_0, e_0 \models \alpha$ but $T_1, e_1 \not\models \alpha$. Again, we will only prove the 
case where Spoiler's first move of his winning strategy is the EU-move. 
The other cases follow in analogous or easier manners. 

Suppose Spoiler plays the EU-move on $T_0$ (without loss of generality), i.e. he 
chooses a future causal chain $\rho_0 = (f_0^0, f_0^1, \ldots, f_0^n)$ with $e_0 = f_0^0$ and $f_0^n = e'_0$. It is 
not hard to show by induction that there are only a finite number of semantically 
inequivalent formulas $\alpha$ with $od(\alpha) \le k$ and $T_0, e \models \alpha$ for any $e \in E_0$. Hence, 
each formula $\beta_0^j = \bigwedge \{\alpha \in OD(k) \mid T_0, f_0^j \models \alpha\} \wedge \bigwedge \{\neg\alpha \in OD(k) \mid T_0, f_0^j \not\models \alpha\}$ is 
well-defined and equivalent to a formula of operator depth $k$ for each $0 \le j \le n$, 
so we have that $\alpha = EU(\bigvee_{0 \le j < n} \beta_0^j, \beta_0^n)$ is a TLC-formula 
with $od(\alpha) = k + 1$ and by definition $T_0, e_0 \models \alpha$. We will argue that $T_1, e_1 \not\models \alpha$. 

Suppose that $T_1, e_1 \models \alpha$. Then there exists a future causal chain 
$\rho_1 = (f_1^0, f_1^1, \ldots, f_1^m)$ with $e_1 = f_1^0$ and $f_1^m = e'_1$ such that $T_1, f_1^l \models \bigvee_{0 \le j < n} \beta_0^j$ for 
each $0 \le l < m$ and $T_1, e'_1 \models \beta_0^n$. 

Assume first that Spoiler chooses to set the new game state to $(e'_0, e'_1)$ by 
following his winning strategy. As $T_1, e'_1 \models \beta_0^n$ it must be the case that for each 
$\gamma \in OD(k)$, $T_0, e'_0 \models \gamma$ iff $T_1, e'_1 \models \gamma$. By induction hypothesis $(T_0, e'_0) \sim_k (T_1, e'_1)$, 
which contradicts that Spoiler has a winning strategy because Preserver could 
initially have played $\rho_1$ as above and continued according to $(T_0, e'_0) \sim_k (T_1, e'_1)$. 

Now assume that Spoiler instead by his winning strategy picks an event 
$f_1$ on $\rho_1$. Then $T_1, f_1 \models \beta_0^j$ for some $0 \le j < n$ as $T_1, f_1 \models \bigvee_{0 \le j < n} \beta_0^j$. 
Again by induction hypothesis we know that $(T_0, f_0^j) \sim_k (T_1, f_1)$ which again 
contradicts that Spoiler has a winning strategy because Preserver could respond 
by picking $f_0^j \in E_0$ and continue from the game state $(f_0^j, f_1)$ according to 
$(T_0, f_0^j) \sim_k (T_1, f_1)$. 

Hence $T_1, e_1 \not\models \alpha$ as required. □ 



5 An Undefinability Result 

In this section we will give an example of a natural property which we, by means 
of the game characterization of the previous section, will show is not definable 
in TLC. Let $(\Sigma, I)$ be a trace alphabet with $\{a, b, c\} \subseteq \Sigma$ such that $a \; D \; c$ and 
$c \; D \; b$ but $a \; I \; b$. Consider $L = [abcabc]^* \subseteq TR(\Sigma, I)$. 
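
Membership of a word's trace in this language can be tested mechanically. The Python sketch below is an illustration and not part of the paper's argument. It relies on the standard projection characterization of trace equivalence: two words represent the same trace iff their projections onto every pair of dependent letters (including pairs of equal letters) coincide. The function names and the encoding of the independence relation as a set of frozensets are hypothetical.

```python
def trace_equivalent(u, v, independent):
    """Do the words u and v represent the same trace?"""
    letters = set(u) | set(v)
    for x in letters:
        for y in letters:
            if frozenset((x, y)) in independent:
                continue                     # independent letters may commute
            # Dependent letters must occur in the same relative order.
            if [ch for ch in u if ch in (x, y)] != [ch for ch in v if ch in (x, y)]:
                return False
    return True


def in_L(word, independent):
    """Does `word` represent a trace in [abcabc]*?"""
    m, r = divmod(len(word), 6)
    return r == 0 and trace_equivalent(word, "abcabc" * m, independent)


I = {frozenset("ab")}                        # a and b are independent
```

With this encoding, `in_L("bacabc", I)` holds because a and b commute, while `in_L("abc" * 5, I)` fails since an odd number of [abc]-factors cannot be grouped into [abcabc]-factors.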

Lemma 2. $L$ is not definable in TLC$(\Sigma, I)$. 

Proof. Let $k \ge 0$ be given and consider $T_0^k = [abc]^{4k}$ and $T_1^k = [abc]^{4k+1}$. 
It suffices to show that $(T_0^k, \bot) \sim_k (T_1^k, \bot)$. By Proposition 1 it then follows that 
$(T_0^k, \bot) \equiv_k (T_1^k, \bot)$. Suppose $L$ were definable by a TLC-formula $\alpha$ of operator 
depth $n$. In particular then $(T_0^n, \bot) \equiv_n (T_1^n, \bot)$. However, by definition it must be 
the case that $T_0^n \in L$ and $T_1^n \notin L$, contradicting that $T_0^n$ and $T_1^n$ satisfy the same 
set of formulas of operator depth at most $n$. Hence, $L$ cannot be expressed by any 
formula of TLC assuming $(T_0^k, \bot) \sim_k (T_1^k, \bot)$ holds for any $k \ge 0$. The remainder 
of the proof will be devoted to showing that it is the case that $(T_0^k, \bot) \sim_k (T_1^k, \bot)$. 

Fig. 1. $T_0^k$ (top) and $T_1^k$ (bottom) on which the game is played. 

To bring this out we need a few definitions. As depicted in Figure 1 the game is 
played on $T_0^k$ and $T_1^k$ consisting of $4k$ and $4k + 1$ copies of the trace factor $[abc]$, 
respectively. The section of $e_i \in E_i$ for $i \in \{0,1\}$ is then defined to be the number 
of the enclosing $[abc]$-factor in $T_i^k$ counting from left and starting with 1. We 
denote this number by $sect(e_i)$. In case $e_i = \bot$ we set $sect(e_i) = 0$. Furthermore, 
we say that $e_0$ and $e_1$ are position equivalent in case either $(e_0, e_1) = (\bot, \bot)$ 
or $\lambda(e_0) = \lambda(e_1)$. From the definition of $T_0^k$ and $T_1^k$ it follows that $e_0$ and $e_1$ 
are position equivalent in case $e_0$ and $e_1$ denote the same local positions in two 
(possibly distinct) sections of $T_0^k$ and $T_1^k$, respectively. The unique event of section 
$s \ge 1$ labelled with letter $x \in \{a, b, c\}$ in $T_i^k$ will be denoted $e_i^{x,s}$. For example, 
the fourth $b$-labelled event of $T_0^k$ is denoted $e_0^{b,4}$. 
We will then show that Preserver has a strategy such that after $k' \le k$ 
rounds played on $(T_0^k, T_1^k)$ with current game state $(e_0, e_1)$, the following 
invariant holds: 

(i) $e_0$ and $e_1$ are position equivalent. 

(ii) $sect(e_0) = sect(e_1)$ or $sect(e_0) = sect(e_1) - 1$. 

(iii) $sect(e_0) = sect(e_1)$ implies $sect(e_0) \le 2(k + k')$. 

(iv) $sect(e_0) = sect(e_1) - 1$ implies $sect(e_0) > 2(k - k') + 1$. 

We prove that the invariant holds by induction on $k'$. It is trivial to observe 
that in the base case we have that $(e_0, e_1) = (\bot, \bot)$, $sect(e_0) = sect(e_1) = 0$ and 
$k' = 0$, thus satisfying (i), (ii), (iii) and (iv) above. 

For the inductive step, assume that the statement holds for $k' < k$. From (i) 
it follows that $(T_0^k, e_0) \equiv_0 (T_1^k, e_1)$, so a next round is played. We then show that 
Preserver can move so as to maintain the invariant for the next game state 
$(e'_0, e'_1)$ by case analysis on the next move chosen by Spoiler. We only consider 
the case for the EU-move. The other moves follow analogously. From (ii) we 
know that $sect(e_0) = sect(e_1)$ or $sect(e_0) = sect(e_1) - 1$, so two subcases arise. 

Subcase I: $sect(e_0) = sect(e_1)$. Suppose Spoiler chooses to play the EU-move on 
$T_0^k$ and highlights a future causal chain $\rho_0 = (e_0 = e_0^{x_0,s_0}, \ldots, e_0^{x_n,s_n} = e'_0)$. 
By assumption $sect(e_0) \le 2(k + k')$. 

Suppose first that $s_n \le 2(k + k' + 1)$. Then Preserver can just copy the move 
and respond with $\rho_1 = (e_1 = e_1^{x_0,s_0}, \ldots, e_1^{x_n,s_n} = e'_1)$. If Spoiler chooses 
to set the new game state to $(e'_0, e'_1)$, $sect(e'_0) = sect(e'_1) \le 2(k + k' + 1)$ and 
the invariant is maintained. If Spoiler instead chooses to pick an event $e_1^{x_i,s_i}$, 
Preserver would respond by picking $e_0^{x_i,s_i}$ and the invariant is maintained in a 
similar manner. 

Suppose then that $s_n > 2(k + k' + 1)$. Preserver must then “insert” an 
additional occurrence of a section into $\rho_1$ at section $2(k + k' + 1)$. To bring 
this out, let $l$ be the least index such that $s_l = 2(k + k' + 1)$, which exists by 
assumption. Preserver then responds with 

$\rho_1 = (e_1^{x_0,s_0}, \ldots, e_1^{x_l,s_l}, e_1^{x_{l+1},s_{l+1}}, e_1^{x_l,s_l+1}, e_1^{x_{l+1},s_{l+1}+1}, e_1^{x_{l+2},s_{l+2}+1}, \ldots, e_1^{x_n,s_n+1})$ 

with $e'_1 = e_1^{x_n,s_n+1}$. If Spoiler chooses to set the new game state to $(e'_0, e'_1)$, 
$sect(e'_0) = s_n = sect(e'_1) - 1$. However, the invariant is maintained as $sect(e'_0) > 
2(k + k' + 1) > 2(k - (k' + 1)) + 1$. If Spoiler instead chooses to pick an event 
on $\rho_1$, Preserver responds dependent upon its index. If Spoiler picks one of 
the first $l + 2$ events $e_1^{x_i,s_i}$, Preserver responds with $e_0^{x_i,s_i}$. As $sect(e_0^{x_i,s_i}) = 
s_i = sect(e_1^{x_i,s_i}) \le 2(k + k' + 1)$ the invariant is maintained. If Spoiler picks 
one of the remaining events $e_1^{x_i,s_i+1}$, Preserver responds with $e_0^{x_i,s_i}$, in which 
case $sect(e_0^{x_i,s_i}) = s_i = sect(e_1^{x_i,s_i+1}) - 1$ and the invariant is maintained as 
$sect(e_0^{x_i,s_i}) \ge 2(k + k' + 1) > 2(k - (k' + 1)) + 1$. 

Suppose Spoiler chooses to play the EU-move on $T_1^k$ and highlights a future 
causal chain $\rho_1 = (e_1 = e_1^{x_0,s_0}, \ldots, e_1^{x_n,s_n} = e'_1)$. By assumption $sect(e_0) \le 
2(k + k')$. If $s_n \le 2(k + k' + 1)$ then Preserver can, as above, just copy the move 
and maintain the invariant, so suppose that $s_n > 2(k + k' + 1)$. Preserver must 
then “chop” a duplicate occurrence off $\rho_1$ around the sections $2(k + k') + 1$, 
$2(k + k') + 2 = 2(k + k' + 1)$, $2(k + k') + 3$ which exist by construction. Any causal chain 
passing through these three sections must pass (at least) two identical $ac$-labelled 
or $bc$-labelled stretches. Now, let $l$ be the least index such that $s_l = 2(k + k') + 1$ 
and consider the sequence $\sigma = (x_l, x_{l+2}, x_{l+4})$ with $\lambda(\sigma) \in \{a, b\}^3$. Remove from 
$\sigma$ the first occurrence $x_i$ where there exists a $j > i$ with $x_j$ in $\sigma$ and $x_i = x_j$. 
Let $\sigma' = (x_p, x_q)$ denote the resulting sequence where $p, q \in \{l, l + 2, l + 4\}$. 
Preserver then plays the chain $\rho_0$: 

$\rho_0 = (e_0^{x_0,s_0}, \ldots, e_0^{x_{l-1},s_{l-1}}, e_0^{x_p,s_l}, e_0^{c,s_{l+1}}, e_0^{x_q,s_{l+2}}, e_0^{c,s_{l+3}}, e_0^{x_{l+5},s_{l+5}-1}, \ldots, e_0^{x_n,s_n-1})$ 

with $e'_0 = e_0^{x_n,s_n-1}$. If Spoiler chooses to set the new game state to $(e'_0, e'_1)$ 
then $sect(e'_0) = s_n - 1 = sect(e'_1) - 1$ so the invariant is maintained because $s_n > 
2(k + k' + 1) > 2(k - (k' + 1)) + 1$. If Spoiler chooses to pick an event on $\rho_0$, 
Preserver responds according to one of several cases. If Spoiler picks one of the 
first $l$ events $e_0^{x_i,s_i}$ then Preserver picks $e_1^{x_i,s_i}$ and the invariant is maintained as 
usual. If Spoiler picks either $e_0^{c,s_{l+1}}$ or $e_0^{c,s_{l+3}}$ then Preserver picks either $e_1^{c,s_{l+1}}$ 
or $e_1^{c,s_{l+3}}$, respectively. As the sections are both $s_{l+1}$ or both $s_{l+3}$ and $s_{l+1} < 
s_{l+3} = s_{l+2} = 2(k + k' + 1)$ the invariant follows. If Spoiler picks an event, 
say $e_0^{x_m,s}$, in $\rho_0$ with $x_m$ before the removed occurrence in $\sigma$ then $m \in \{l, l + 2\}$ 
and Preserver responds by $e_1^{x_m,s}$. Then $sect(e_0^{x_m,s}) = s = sect(e_1^{x_m,s}) \le s_{l+2} = 
2(k + k' + 1)$. Similarly, if $x_m$ occurs after the removed occurrence then $m \in \{l + 
2, l + 4\}$ and Preserver picks $e_1^{x_m,s+1}$. Then $sect(e_0^{x_m,s}) = s = sect(e_1^{x_m,s+1}) - 1 > 
2(k + k') > 2(k - (k' + 1)) + 1$ and in both cases the invariant is maintained. Finally, 
if Spoiler picks one of the remaining events $e_0^{x_i,s_i-1}$ with $i \ge l + 5$ then Preserver 
responds with $e_1^{x_i,s_i}$. As $sect(e_0^{x_i,s_i-1}) = sect(e_1^{x_i,s_i}) - 1 > 2(k - (k' + 1)) + 1$ the 
invariant is also maintained in this case. 

Subcase II: $sect(e_0) = sect(e_1) - 1$. Here the futures of $e_0$ in $T_0^k$ and $e_1$ in $T_1^k$ 
both consist of $4k - sect(e_0)$ factors of $[abc]$ and are identical with respect to 
future moves. Hence Preserver can just “copy” the move made by Spoiler. □ 

6 The Expressiveness of TLC* 

Let $(\Sigma, I)$ be any trace alphabet with $\{a, b, c\} \subseteq \Sigma$ such that $a \; D \; c$ and $c \; D \; b$ 
but $a \; I \; b$. Consider $L = [abcabc]^* \subseteq TR(\Sigma, I)$ from the previous section. 

Lemma 3. $L$ is definable in TLC*$(\Sigma, I)$. 

Proof. Our proof will in fact show that the future fragment of TLC with only 
one future chain quantifier of TLC* suffices to express $L$. First define 

$\alpha_{[abc]^*} = \bigwedge_{d \in \Sigma - \{a,b,c\}} AG(\neg p_d) \wedge EX(p_a \wedge EX(p_c)) \wedge EX(p_b \wedge EX(p_c)) \wedge AG(p_c \wedge EX(tt) \rightarrow EX(p_a \wedge EX(p_c)) \wedge EX(p_b \wedge EX(p_c)))$. 

It is easy to see that $T \models \alpha_{[abc]^*}$ iff $T \in [abc]^*$. We will then use existence of 
“zig-zagging” future causal chains to restrict to $[abcabc]^* \subseteq [abc]^*$ below. Define 
the future chain formula $\phi_{(acbc)^*}$ as follows. 

$\phi_{(acbc)^*} = p_a \wedge G(p_a \rightarrow X(p_c \wedge X(p_b \wedge X(p_c \wedge (\neg X tt \vee X p_a)))))$. 

It is easy to see that $T, e \models E(\phi_{(acbc)^*})$ iff there exists a future causal chain $\rho$ 
rooted at $e$ such that $\lambda(\rho) \in (acbc)^* \subseteq \Sigma^*$. The statement of the lemma now 
follows by taking $\alpha_L = \alpha_{[abc]^*} \wedge (\neg EX tt \vee EX(E(\phi_{(acbc)^*})))$. □ 

Putting all the pieces together, we can now state and prove the main result 
of the paper. 

Theorem 4. Let (A’,/) be any trace alphabet. Then 

E TLC(A,i) = TLC*(A,i) if D is transitive. 

2. TLC(A,/) C TLC*(A,/) if D is not transitive. 
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Proof. Obviously TLC(A\i) C TLC*(A\i), so (2) follows easily from Lemma 2 
and Lemma 3 as (a, c), (c, b) ^ D but (a, b) ^ D witness that D is not transitive. 
Hence it suffices to prove (1). 

Let $(\Sigma, I)$ be a trace alphabet with $D$ transitive, i.e. the graph $(\Sigma, D)$ is
a disjoint union of cliques. Thus any trace $T \in TR(\Sigma, I)$ consists of
disjoint causal chains, each labelled within a single clique, that are only
initially connected. We can then define three mutually inductive translations
$\|\cdot\|_{ev}$, $\|\cdot\|^{+}_{ch}$ and $\|\cdot\|^{-}_{ch}$ converting
event formulas, future chain formulas and past chain formulas, respectively, of
TLC*$(\Sigma, I)$ to formulas of TLC$(\Sigma, I)$ as follows.



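$\|p_a\|_{ev} = p_a$ and the boolean connectives are as expected.
$\|co(\alpha)\|_{ev} = co(\|\alpha\|_{ev})$.
$\|E(\varphi)\|_{ev} = \|\varphi\|^{+}_{ch}$ and $\|E^{-}(\varphi)\|_{ev} = \|\varphi\|^{-}_{ch}$.

$\|\alpha\|^{+}_{ch} = \|\alpha\|^{-}_{ch} = \|\alpha\|_{ev}$ and the boolean connectives are as expected.
$\|X\varphi\|^{+}_{ch} = EX(\|\varphi\|^{+}_{ch})$ and $\|X^{-}\varphi\|^{-}_{ch} = EX^{-}(\|\varphi\|^{-}_{ch})$.
$\|\varphi\,U\,\psi\|^{+}_{ch} = EU(\|\varphi\|^{+}_{ch}, \|\psi\|^{+}_{ch})$ and $\|\varphi\,U^{-}\psi\|^{-}_{ch} = EU^{-}(\|\varphi\|^{-}_{ch}, \|\psi\|^{-}_{ch})$.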

By nested inductions one can show that for each $\alpha \in$ TLC*$(\Sigma, I)$, $T, e \models \alpha$ iff
$T, e \models \|\alpha\|_{ev}$. As $\|\alpha\|_{ev} \in$ TLC$(\Sigma, I)$, the required conclusion follows. □



It is easy to show that $L \notin$ FO$(\Sigma, I)$, so as a simple corollary we can conclude
that TLC*$(\Sigma, I)$ is not in general included in FO$(\Sigma, I)$, even though they are
expressively equivalent in the sequential case where $I = \emptyset$.

Abstract. Duration Calculus with Iteration (DC*) has been used as an
interface between the original Duration Calculus and Timed Automata, but
has not been studied rigorously. In this paper, we study a subset of DC*
formulas, the so-called simple ones, which corresponds precisely
to the class of Timed Automata. We give a complete proof system and
decidability results for the subset.



1 Introduction 

Duration Calculus (DC) was introduced by Zhou, Hoare and Ravn in 1991 as a
logic to specify the requirements for real-time systems. DC has been used
successfully in many case studies. In [4], we developed a method for designing
a real-time hybrid system from its specification in DC. In that paper, we
introduced a class of so-called simple Duration Calculus formulas with iteration,
which corresponds precisely to the class of real-time automata, to express the
design of real-time hybrid systems, and showed how to derive a design in this
language from a specification in the original Duration Calculus. We used the
definition of the semantics of our design language to reason about the correctness
of our design. However, it would be more practical and interesting if the
correctness of a design could be proved syntactically with a tool. Therefore,
developing a proof system to assist the formal verification of the design plays an
important role in making formal methods usable in the design process of real-time
systems. This is our aim in this paper.

We achieve our aim in the following way. First we extend DC with the
iteration operator (*) to obtain a logic called DC*, and define a subclass of DC*
formulas called simple DC* formulas to express the designs. Secondly we develop
a proof system that is complete for proving that a simple DC* formula $D$
implies a DC formula $S$, meaning that any valid implication of this form can be
proved in our proof system.

To illustrate our idea, let us consider a classical simple example, the Gas Burner,
taken from [14]. The time-critical requirement of a gas burner is specified by a
DC formula denoted by $S$, defined as $\Box(\ell \ge 60 \Rightarrow 20 \int leak \le \ell)$, which says
that during the operation of the system, if the interval over which the system is
observed is at least 1 min, the proportion of time spent in the leak state is not
more than one-twentieth of the elapsed time.

One can design the Gas Burner as a real-time automaton depicted in Fig. 1,
which expresses that any leak must be detected and stopped within one second,
and that leaks must be separated by at least 30 s. A natural way to express the
behaviour of the automaton is to use a notation like that of classical regular
expressions:

$$D = ((\lceil leak \rceil \wedge \ell \le 1) \frown (\lceil nonleak \rceil \wedge \ell \ge 30))^* .$$

Here we assume that the gas burner starts from the leak state. We will see later
that $D$ is a DC formula with iteration. It expresses not only the temporal order
of states but also the time constraints on the state periods.

By using our complete proof system we can show formally the implication
$D \Rightarrow S$, which expresses naturally the correctness of the design.
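
Before proving the implication, it can be helpful to test it on sample behaviours. The following Python sketch is only a sanity check under stated assumptions, not a proof: a behaviour is assumed to be a finite list of (is_leak, duration) phases, and observation windows are restricted to a grid of time points. The names `leak_time` and `satisfies_requirement` are illustrative.

```python
from fractions import Fraction

def leak_time(phases, start, end):
    """Time spent in the leak state within [start, end]; phases is a list of (is_leak, duration)."""
    total, t = Fraction(0), Fraction(0)
    for is_leak, duration in phases:
        lo, hi = max(t, start), min(t + duration, end)
        if is_leak and hi > lo:
            total += hi - lo
        t += duration
    return total

def satisfies_requirement(phases, step=Fraction(1, 2)):
    """Check 20 * leak_time <= length on every grid window of length at least 60."""
    horizon = sum(duration for _, duration in phases)
    points = [i * step for i in range(int(horizon / step) + 1)]
    return all(20 * leak_time(phases, a, b) <= b - a
               for a in points for b in points if b - a >= 60)

# Extreme behaviour allowed by D: each leak lasts 1 s, each non-leak phase exactly 30 s.
design = [(True, Fraction(1)), (False, Fraction(30))] * 4
# A behaviour outside D: leaks separated by only 5 s.
faulty = [(True, Fraction(1)), (False, Fraction(5))] * 20

print(satisfies_requirement(design))  # True
print(satisfies_requirement(faulty))  # False
```

Exact rationals are used so that boundary cases are not blurred by floating-point rounding.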



[Figure: real-time automaton with two states, leak (labelled [0,1]) and nonleak]

Fig. 1. Simple design of Gas Burner



The class of simple DC* formulas has the interesting property that it is
decidable, which means that we can decide whether a design is implementable.
Furthermore, for some classes of DC formulas such as linear duration invariants
(see [15,11,5]), the implication from a simple DC* formula to a formula in the
class can be checked by a simple algorithm.

The paper is organised as follows. In the next section, we give the syntax 
and semantics of our Duration Calculus with Iteration. In the third section we 
will give a proof system for the calculus. We prove the completeness of our proof 
system for the class of simple DC* formulas in Section 4. The decidability of the 
class will be discussed in the last section. 

2 Duration Calculus with Iteration 

This section presents the formal definition of Duration Calculus with iteration,
which is a conservative extension of Duration Calculus [14].

A language for DC* is built starting from the following sets of symbols: a set
of constant symbols $\{a, b, c, \ldots\}$, a set of individual variables $\{x, y, z, \ldots\}$, a set of
state variables $\{P, Q, \ldots\}$, a set of temporal variables $\{u, v, \ldots\}$, a set of function
symbols $\{f, g, \ldots\}$, a set of relation symbols $\{R, U, \ldots\}$, and a set of propositional
temporal letters $\{A, B, \ldots\}$. These sets are required to be pairwise disjoint and
disjoint with the set $\{0, \bot, \neg, \vee, \frown, {}^*, \exists, \int, (, )\}$. Besides, 0 should be one of the
constant symbols; + should be a binary function symbol; = and $\le$ should be
binary relation symbols.

Given the sets of symbols, a DC* language definition is essentially that of
the sets of state expressions $S$, terms $t$ and formulas $\varphi$ of the language. These
sets can be defined by the following BNFs:

$S ::= 0 \mid P \mid \neg S \mid S \vee S$
$t ::= c \mid x \mid v \mid \int S \mid f(t, \ldots, t)$
$\varphi ::= \bot \mid A \mid R(t, \ldots, t) \mid \neg\varphi \mid (\varphi \vee \varphi) \mid (\varphi \frown \varphi) \mid (\varphi^*) \mid \exists x \varphi$

Terms and formulas that have no occurrences of $\frown$ (chop), nor of temporal
variables, or $\int$, are called rigid.

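The linearly ordered field of the real numbers,

$$(\mathbb{R}, =_{\mathbb{R}}, 0_{\mathbb{R}}, 1_{\mathbb{R}}, +_{\mathbb{R}}, -_{\mathbb{R}}, \times_{\mathbb{R}}, /_{\mathbb{R}}, \le_{\mathbb{R}}) ,$$

is the most important component of DC semantics. We denote by $\mathbb{I}$ the set of
the bounded intervals over $\mathbb{R}$, $\{[\tau_1, \tau_2] \mid \tau_1, \tau_2 \in \mathbb{R}, \tau_1 \le_{\mathbb{R}} \tau_2\}$. For a set $A \subseteq \mathbb{R}$,
we denote by $\mathbb{I}(A)$ the set $\{[\tau_1, \tau_2] \in \mathbb{I} \mid \tau_1, \tau_2 \in A\}$ of intervals with end-points
in $A$.

Given a DC* language $\mathcal{L}$, a model for $\mathcal{L}$ is an interpretation $\mathcal{I}$ of the symbols
of $\mathcal{L}$ that satisfies the following conditions: $\mathcal{I}(c), \mathcal{I}(x) \in \mathbb{R}$ for constant symbols
$c$ and individual variables $x$; $\mathcal{I}(f) : \mathbb{R}^n \to \mathbb{R}$ for n-place function symbols $f$;
$\mathcal{I}(v) : \mathbb{I} \to \mathbb{R}$ for temporal variables $v$; $\mathcal{I}(R) : \mathbb{R}^n \to \{0, 1\}$ for n-place relation
symbols $R$; $\mathcal{I}(P) : \mathbb{R} \to \{0, 1\}$ for state variables $P$; and $\mathcal{I}(A) : \mathbb{I} \to \{0, 1\}$ for
temporal propositional letters $A$. Besides, $\mathcal{I}(0) = 0_{\mathbb{R}}$, $\mathcal{I}(+) = +_{\mathbb{R}}$, $\mathcal{I}(=)$ is $=_{\mathbb{R}}$,
and $\mathcal{I}(\le)$ is $\le_{\mathbb{R}}$. The following condition, known as finite variability of state, is
imposed on interpretations: for every $[\tau_1, \tau_2] \in \mathbb{I}$ such that $\tau_1 < \tau_2$, and every
state variable $S$, there exist $\tau'_1, \ldots, \tau'_n \in \mathbb{R}$ such that $\tau_1 = \tau'_1 < \ldots < \tau'_n = \tau_2$ and
$\mathcal{I}(S)$ is constant on the intervals $(\tau'_i, \tau'_{i+1})$, $i = 1, \ldots, n - 1$.

For the rest of the paper we omit the index $\mathbb{R}$ that distinguishes operations
on reals from the corresponding symbols.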

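Definition 1. Given a DC interpretation $\mathcal{I}$ for the DC language $\mathcal{L}$, the meaning
of state expressions $S$ in $\mathcal{L}$ under $\mathcal{I}$, $S_{\mathcal{I}} : \mathbb{R} \to \{0, 1\}$, is defined inductively
as follows: for all $\tau \in \mathbb{R}$

$0_{\mathcal{I}}(\tau) = 0$
$P_{\mathcal{I}}(\tau) = \mathcal{I}(P)(\tau)$ for state variables $P$
$(\neg S)_{\mathcal{I}}(\tau) = 1 - S_{\mathcal{I}}(\tau)$
$(S_1 \vee S_2)_{\mathcal{I}}(\tau) = \max((S_1)_{\mathcal{I}}(\tau), (S_2)_{\mathcal{I}}(\tau))$

Given an interval $[\tau_1, \tau_2] \in \mathbb{I}$, the meaning of a term $t$ in $\mathcal{L}$ under $\mathcal{I}$ is a number
$\mathcal{I}_{\tau_1}^{\tau_2}(t) \in \mathbb{R}$ defined inductively as follows:

$\mathcal{I}_{\tau_1}^{\tau_2}(c) = \mathcal{I}(c)$ for constant symbols $c$,
$\mathcal{I}_{\tau_1}^{\tau_2}(x) = \mathcal{I}(x)$ for individual variables $x$,
$\mathcal{I}_{\tau_1}^{\tau_2}(v) = \mathcal{I}(v)([\tau_1, \tau_2])$ for temporal variables $v$,
$\mathcal{I}_{\tau_1}^{\tau_2}(\int S) = \int_{\tau_1}^{\tau_2} S_{\mathcal{I}}(\tau)\,d\tau$ for state expressions $S$,
$\mathcal{I}_{\tau_1}^{\tau_2}(f(t_1, \ldots, t_n)) = \mathcal{I}(f)(\mathcal{I}_{\tau_1}^{\tau_2}(t_1), \ldots, \mathcal{I}_{\tau_1}^{\tau_2}(t_n))$ for n-place function symbols $f$.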
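
The two definitions above can be made concrete for finitely variable interpretations. The following Python sketch assumes that an interpretation is given as a finite list of segments on which all state variables are constant, and that state expressions are encoded as nested tuples; both encodings, and the names `eval_state` and `duration`, are illustrative choices and not part of the paper.

```python
from fractions import Fraction

# State expressions as nested tuples: ("zero",), ("var", name), ("not", s), ("or", s1, s2).
def eval_state(expr, valuation):
    """Value (0 or 1) of a state expression at a time point with the given variable values."""
    tag = expr[0]
    if tag == "zero":
        return 0
    if tag == "var":
        return valuation[expr[1]]
    if tag == "not":
        return 1 - eval_state(expr[1], valuation)
    if tag == "or":
        return max(eval_state(expr[1], valuation), eval_state(expr[2], valuation))
    raise ValueError(f"unknown state expression: {expr!r}")

def duration(expr, segments, t1, t2):
    """Integral of expr over [t1, t2]; segments are (start, end, valuation) with constant valuation."""
    total = Fraction(0)
    for start, end, valuation in segments:
        lo, hi = max(start, t1), min(end, t2)
        if hi > lo:
            total += eval_state(expr, valuation) * (hi - lo)
    return total

segments = [
    (Fraction(0), Fraction(1), {"leak": 1}),
    (Fraction(1), Fraction(31), {"leak": 0}),
    (Fraction(31), Fraction(32), {"leak": 1}),
]
leak = ("var", "leak")
print(duration(leak, segments, Fraction(0), Fraction(32)))           # 2
print(duration(("not", leak), segments, Fraction(0), Fraction(32)))  # 30
```

Because the state is piecewise constant, the integral reduces to a finite sum over the segments that overlap the interval.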



The definitions given so far are relevant to the semantics of DC in general.
The extension to the semantics that comes with DC* appears in the definition
of the $\models$ relation below. Let us recall the traditional relation on interpretations:
for interpretations $\mathcal{I}$ and $\mathcal{J}$ of the symbols of the same DC* language $\mathcal{L}$ and
for a symbol $x$ in $\mathcal{L}$, we say $\mathcal{I}$ x-agrees with $\mathcal{J}$ iff $\mathcal{I}(s) = \mathcal{J}(s)$ for all symbols $s$
in $\mathcal{L}$ but possibly $x$.

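Definition 2. Given a DC* language $\mathcal{L}$, and an interpretation $\mathcal{I}$ of the symbols
of $\mathcal{L}$, the relation $\mathcal{I}, [\tau_1, \tau_2] \models \varphi$ for $[\tau_1, \tau_2] \in \mathbb{I}$ and formulas $\varphi$ in $\mathcal{L}$ is defined
by induction on the construction of $\varphi$ as follows:

$\mathcal{I}, [\tau_1, \tau_2] \not\models \bot$
$\mathcal{I}, [\tau_1, \tau_2] \models A$ iff $\mathcal{I}(A)([\tau_1, \tau_2]) = 1$ for temporal propositional letters $A$
$\mathcal{I}, [\tau_1, \tau_2] \models R(t_1, \ldots, t_n)$ iff $\mathcal{I}(R)(\mathcal{I}_{\tau_1}^{\tau_2}(t_1), \ldots, \mathcal{I}_{\tau_1}^{\tau_2}(t_n)) = 1$
$\mathcal{I}, [\tau_1, \tau_2] \models \neg\varphi$ iff $\mathcal{I}, [\tau_1, \tau_2] \not\models \varphi$
$\mathcal{I}, [\tau_1, \tau_2] \models (\varphi \vee \psi)$ iff either $\mathcal{I}, [\tau_1, \tau_2] \models \varphi$ or $\mathcal{I}, [\tau_1, \tau_2] \models \psi$
$\mathcal{I}, [\tau_1, \tau_2] \models (\varphi \frown \psi)$ iff $\mathcal{I}, [\tau_1, \tau] \models \varphi$ and $\mathcal{I}, [\tau, \tau_2] \models \psi$ for some $\tau \in [\tau_1, \tau_2]$
$\mathcal{I}, [\tau_1, \tau_2] \models (\varphi^*)$ iff either $\tau_1 = \tau_2$, or there exist $\tau'_1, \ldots, \tau'_n \in \mathbb{R}$ such that $\tau_1 = \tau'_1 < \ldots < \tau'_n = \tau_2$ and $\mathcal{I}, [\tau'_i, \tau'_{i+1}] \models \varphi$ for $i = 1, \ldots, n - 1$
$\mathcal{I}, [\tau_1, \tau_2] \models \exists x \varphi$ iff $\mathcal{J}, [\tau_1, \tau_2] \models \varphi$ for some $\mathcal{J}$ that x-agrees with $\mathcal{I}$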



Note that the clauses that define the interpretation of constructs other than * 
in DC* are the same as in DC. This entails that DC* is a conservative extension 
of DC. 

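Let $\mathcal{I}$ be a DC interpretation, $\varphi$ be a DC* formula, and $J_1, J_2, J \subseteq \mathbb{I}$ be sets
of intervals. Let $k < \omega$. We introduce the following notations for our convenience.

$\tilde{\mathcal{I}}(\varphi) = \{[\tau_1, \tau_2] \in \mathbb{I} \mid \mathcal{I}, [\tau_1, \tau_2] \models \varphi\}$
$J_1 \frown J_2 = \{[\tau_1, \tau_2] \in \mathbb{I} \mid (\exists \tau \in \mathbb{R})([\tau_1, \tau] \in J_1 \wedge [\tau, \tau_2] \in J_2)\}$
$J^0 = \{[\tau, \tau] \mid \tau \in \mathbb{R}\}$
$J^k = \underbrace{J \frown \ldots \frown J}_{k \text{ times}}$ for $k > 0$
$J^* = \bigcup_{k < \omega} J^k$

In words, $\tilde{\mathcal{I}}(\varphi)$ is the set of intervals that satisfy $\varphi$ under $\mathcal{I}$, $J_1 \frown J_2$ is the set of
intervals that are the concatenation of an interval in $J_1$ and an interval in $J_2$,
and $J^*$ is the iteration of $J$ corresponding to the operation $\frown$.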
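
The set operations above can be tried out on finite examples. The following Python sketch assumes that intervals are pairs of numbers and that time is restricted to a finite set of points, so that the point intervals of $J^0$ form a finite set; this restriction and the names `chop`, `power` and `star` are assumptions made for illustration only.

```python
def chop(j1, j2):
    """Intervals that split at some point into a j1-interval followed by a j2-interval."""
    return {(a, c) for (a, b) in j1 for (b2, c) in j2 if b == b2}

def power(j, k, points):
    """J^k, where J^0 is the set of point intervals over the finite set `points`."""
    result = {(t, t) for t in points}
    for _ in range(k):
        result = chop(result, j)
    return result

def star(j, points):
    """J* over a finite set of points: keep appending J-intervals until nothing new appears."""
    result = {(t, t) for t in points}
    while True:
        extended = result | chop(result, j)
        if extended == result:
            return result
        result = extended

points = range(5)
J = {(0, 2), (2, 3), (3, 3)}
print(sorted(power(J, 2, points)))  # [(0, 3), (2, 3), (3, 3)]
print((0, 3) in star(J, points), (0, 4) in star(J, points))  # True False
```

The loop in `star` terminates because only finitely many intervals can be built from the end-points occurring in `points` and `J`.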

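In the rest of the paper, for the convenience of reading, we use the following
conventions. We use the customary infix notation for terms with +, and formulas
with $\le$ and = occurring in them. We introduce the constant $\top$, the boolean
connectives $\wedge$, $\Rightarrow$, $\Leftrightarrow$ and the relation symbols $\ne$, $\ge$, $<$ and $>$, and the $\forall$
quantifier as abbreviations in the usual way. We assume that boolean connectives
bind more tightly than $\frown$. Since $\frown$ is associative, we omit parentheses in
formulas that contain consecutive occurrences of $\frown$. Besides, we use the following
abbreviations, that are generally accepted in Duration Calculus:

$1 \rightleftharpoons \neg 0$
$\ell \rightleftharpoons \int 1$
$\lceil S \rceil \rightleftharpoons \int S = \ell \wedge \ell \ne 0$
$\Diamond\varphi \rightleftharpoons \top \frown \varphi \frown \top$
$\Box\varphi \rightleftharpoons \neg\Diamond\neg\varphi$
$\varphi^0 \rightleftharpoons \ell = 0$
$\varphi^k \rightleftharpoons \underbrace{\varphi \frown \ldots \frown \varphi}_{k \text{ times}}$ for $k > 0$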



3 A Proof System for DC* 

In this section, we propose a proof system for DC* which consists of a complete
Hilbert-style proof system for first order logic (cf. e.g. [13]), axioms and rules for
interval logic (cf. e.g. [7]), Duration Calculus axioms and rules ([9]) and axioms
about iteration ([6]). We assume that the readers are familiar with Hilbert-style
proof systems for first order logic and do not give one here. Here follow the
interval logic and DC-specific axioms and rules.



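Axioms and Rules for Interval Logic

($A1_l$) $(\varphi \frown \psi) \wedge \neg(\chi \frown \psi) \Rightarrow (\varphi \wedge \neg\chi \frown \psi)$
($A1_r$) $(\varphi \frown \psi) \wedge \neg(\varphi \frown \chi) \Rightarrow (\varphi \frown \psi \wedge \neg\chi)$
($A2$) $((\varphi \frown \psi) \frown \chi) \Leftrightarrow (\varphi \frown (\psi \frown \chi))$
($R_l$) $(\varphi \frown \psi) \Rightarrow \varphi$ if $\varphi$ is rigid
($R_r$) $(\varphi \frown \psi) \Rightarrow \psi$ if $\psi$ is rigid
($B_l$) $(\exists x \varphi \frown \psi) \Rightarrow \exists x (\varphi \frown \psi)$ if $x$ is not free in $\psi$
($B_r$) $(\varphi \frown \exists x \psi) \Rightarrow \exists x (\varphi \frown \psi)$ if $x$ is not free in $\varphi$
($L1_l$) $(\ell = x \frown \varphi) \Rightarrow \neg(\ell = x \frown \neg\varphi)$
($L1_r$) $(\varphi \frown \ell = x) \Rightarrow \neg(\neg\varphi \frown \ell = x)$
($L2$) $\ell = x + y \Leftrightarrow (\ell = x \frown \ell = y)$
($L3_l$) $\varphi \Rightarrow (\ell = 0 \frown \varphi)$
($L3_r$) $\varphi \Rightarrow (\varphi \frown \ell = 0)$

($N_l$) from $\varphi$ infer $\neg(\neg\varphi \frown \psi)$
($N_r$) from $\varphi$ infer $\neg(\psi \frown \neg\varphi)$
($Mono_l$) from $\varphi \Rightarrow \psi$ infer $(\varphi \frown \chi) \Rightarrow (\psi \frown \chi)$
($Mono_r$) from $\varphi \Rightarrow \psi$ infer $(\chi \frown \varphi) \Rightarrow (\chi \frown \psi)$

Duration Calculus Axioms and Rules

($DC1$) $\int 0 = 0$
($DC2$) $\int 1 = \ell$
($DC3$) $\int S \ge 0$
($DC4$) $\int S_1 + \int S_2 = \int (S_1 \vee S_2) + \int (S_1 \wedge S_2)$
($DC5$) $(\int S = x \frown \int S = y) \Rightarrow \int S = x + y$
($DC6$) $\int S_1 = \int S_2$ if $S_1 \Leftrightarrow S_2$ in propositional calculus.

($IR_1$) from $[\ell = 0/A]\varphi$, $\varphi \Rightarrow [A \frown \lceil S \rceil / A]\varphi$ and $\varphi \Rightarrow [A \frown \lceil \neg S \rceil / A]\varphi$ infer $[\top/A]\varphi$
($IR_2$) from $[\ell = 0/A]\varphi$, $\varphi \Rightarrow [\lceil S \rceil \frown A / A]\varphi$ and $\varphi \Rightarrow [\lceil \neg S \rceil \frown A / A]\varphi$ infer $[\top/A]\varphi$
($\omega$) from $[(\lceil S \rceil \vee \lceil \neg S \rceil)^k / A]\varphi$ for all $k < \omega$ infer $[\top/A]\varphi$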

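Axioms about Iteration

($DC^*_1$) $\ell = 0 \Rightarrow \varphi^*$
($DC^*_2$) $(\varphi^* \frown \varphi) \Rightarrow \varphi^*$
($DC^*_3$) $(\varphi^* \wedge \psi \frown \top) \Rightarrow (\psi \wedge \ell = 0 \frown \top) \vee (((\varphi^* \wedge \neg\psi \frown \varphi) \wedge \psi) \frown \top)$.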

The intuition behind $DC^*_1$ and $DC^*_2$ is quite straightforward. To see $DC^*_3$,
assume that some initial subinterval of a given interval satisfies $\psi$, and can be
chopped into finitely many parts, each satisfying $\varphi$. Then there exists a smallest one
among the initial subintervals of the given one that are formed by these parts and make $\psi$
hold. It is either the 0-length initial subinterval, or otherwise consists of an initial
subinterval that does not satisfy $\psi$, followed by one part satisfying $\varphi$.

A restriction is made on the application of first order logic rules and axioms
that involve substitution: $[t/x]\varphi$ is defined if no variable in $t$ becomes bound due
to the substitution, and either $t$ is rigid or $\frown$ does not occur in $\varphi$.

It is known that the above proof system for interval logic is complete with
respect to an abstract class of time domains in place of $\mathbb{R}$ [7]. The proof system
for interval logic, extended with the axioms $DC1$-$DC6$ and the rules $IR_1$, $IR_2$,
is complete relative to the class of interval logic sentences that are valid on its
real time frame [9]. Taking the infinitary rule $\omega$ instead of $IR_1$ and $IR_2$ yields
an $\omega$-complete system for DC with respect to an abstract class of time domains,
like that of interval logic [8]. Adding appropriate axioms about reals, and a rule
like, e.g.,

from $kx \le 1$ for all $k < \omega$ infer $x \le 0$,

where $k$ stands for $1 + \ldots + 1$ ($k$ times), extends this system to one that is
$\omega$-complete with respect to the real time based semantics of DC given above.

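In the rest of this section we show that adding $DC^*_1$-$DC^*_3$ to the proof system
of DC makes it complete for sentences where iteration is allowed only for a
restricted class of formulas that we call simple. The following theorem gives the
soundness of these axioms.

Theorem 3. Let $\mathcal{I}$ be a Duration Calculus interpretation. Then $\mathcal{I}$ validates
$DC^*_1$-$DC^*_3$.

Proof. The proof about $DC^*_1$ and $DC^*_2$ is trivial and we omit it here. Now
consider $DC^*_3$. Let $[\tau_1, \tau_2] \in \mathbb{I}$ be such that $\mathcal{I}, [\tau_1, \tau_2] \models (\varphi^* \wedge \psi \frown \top)$ and
$\mathcal{I}, [\tau_1, \tau_2] \models \neg(\psi \wedge \ell = 0 \frown \top)$. We shall prove that
$\mathcal{I}, [\tau_1, \tau_2] \models (((\varphi^* \wedge \neg\psi \frown \varphi) \wedge \psi) \frown \top)$. We
have that $[\tau_1, \tau_1] \notin \tilde{\mathcal{I}}(\psi)$, and $[\tau_1, \tau] \in (\tilde{\mathcal{I}}(\varphi))^k \cap \tilde{\mathcal{I}}(\psi)$ for some $k < \omega$ and some
$\tau \in [\tau_1, \tau_2]$. Then there exist $\tau'_1, \ldots, \tau'_{k+1}$ such that $\tau_1 = \tau'_1 < \ldots < \tau'_{k+1} = \tau$
and $\mathcal{I}, [\tau'_i, \tau'_{i+1}] \models \varphi$ for $i = 1, \ldots, k$. Since $\mathcal{I}, [\tau_1, \tau'_{k+1}] \models \psi$ and $\mathcal{I}, [\tau_1, \tau'_1] \not\models \psi$, there
must be $i \le k$ for which $\mathcal{I}, [\tau_1, \tau'_i] \not\models \psi$ and $\mathcal{I}, [\tau_1, \tau'_{i+1}] \models \psi$. Therefore
$\mathcal{I}, [\tau_1, \tau'_{i+1}] \models ((\varphi^* \wedge \neg\psi \frown \varphi) \wedge \psi)$, which implies that
$\mathcal{I}, [\tau_1, \tau_2] \models (((\varphi^* \wedge \neg\psi \frown \varphi) \wedge \psi) \frown \top)$. □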



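Let us prove the monotonicity of * from these axioms, which says that if
$\varphi \Rightarrow \gamma$ then $\varphi^* \Rightarrow \gamma^*$.

$\varphi^* \wedge \neg\gamma^* \Rightarrow (\neg\gamma^* \wedge \ell = 0 \frown \top) \vee (((\varphi^* \wedge \gamma^* \frown \varphi) \wedge \neg\gamma^*) \frown \top)$   by $DC^*_3$
$\Rightarrow (((\varphi^* \wedge \gamma^* \frown \varphi) \wedge \neg\gamma^*) \frown \top)$   by $DC^*_1$
$\Rightarrow (\neg\gamma^* \wedge \gamma^*) \frown \top$   by $DC^*_2$ and $\varphi \Rightarrow \gamma$
$\Rightarrow \bot$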



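The following theorem is useful in practice. The readers are referred to [6]
for its formal proof.

Theorem 4.

$\vdash_{DC^*} \Box(\varphi \Rightarrow \neg(\top \frown \neg\alpha) \wedge \neg(\neg\beta \frown \top)) \wedge \Box(\ell = 0 \Rightarrow \alpha \wedge \beta) \Rightarrow (\varphi^* \Rightarrow \Box(\alpha \frown \varphi^* \frown \beta))$.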

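Let us now use the proof system of DC* to prove the implication for the
correctness of the simple Gas Burner mentioned in the introduction of the paper.
We have to prove that

$((\lceil leak \rceil \wedge \ell \le 1) \frown (\lceil nonleak \rceil \wedge \ell \ge 30))^* \Rightarrow \Box(\ell \ge 60 \Rightarrow \int leak \le (1/20)\ell)$.

Let us denote

$\varphi \rightleftharpoons (\lceil leak \rceil \wedge \ell \le 1) \frown (\lceil \neg leak \rceil \wedge \ell \ge 30)$,
$\alpha \rightleftharpoons \ell = 0 \vee \lceil \neg leak \rceil \vee ((\lceil leak \rceil \wedge \ell \le 1) \frown (\lceil \neg leak \rceil \wedge \ell \ge 30))$,
$\beta \rightleftharpoons \ell = 0 \vee ((\ell \le 1 \wedge \lceil leak \rceil) \frown (\ell = 0 \vee \lceil \neg leak \rceil))$.

From DC axioms it can be proved easily that $\vdash_{DC} \Box(\varphi \Rightarrow \neg(\top \frown \neg\alpha) \wedge
\neg(\neg\beta \frown \top))$ and $\vdash_{DC} \Box(\ell = 0 \Rightarrow \alpha \wedge \beta)$. Therefore, from Theorem 4 we can
complete the proof of the above if we can derive that
$((\alpha \frown \varphi^*) \frown \beta) \wedge \ell \ge 60 \Rightarrow 20 \int leak \le \ell$. This is done as follows.

 1. $\alpha \Rightarrow 31 \int leak \le \ell$   (DC)
 2. $\varphi^* \wedge 31 \int leak > \ell \Rightarrow (\varphi^* \wedge 31 \int leak > \ell \frown \top)$   (DC)
 3. $\varphi \Rightarrow 31 \int leak \le \ell$   (DC)
 4. $(\varphi^* \wedge 31 \int leak > \ell \frown \top) \Rightarrow (\ell = 0 \wedge 31 \int leak > \ell \frown \top) \vee (((\varphi^* \wedge 31 \int leak \le \ell \frown \varphi) \wedge 31 \int leak > \ell) \frown \top)$   (by $DC^*_3$)
 5. $\ell = 0 \Rightarrow 31 \int leak \le \ell$   (DC)
 6. $(\varphi^* \wedge 31 \int leak > \ell \frown \top) \Rightarrow (((\varphi^* \wedge 31 \int leak \le \ell \frown \varphi) \wedge 31 \int leak > \ell) \frown \top)$   (by 4, 5, $Mono_r$)
 7. $(\varphi^* \wedge 31 \int leak \le \ell \frown \varphi) \Rightarrow 31 \int leak \le \ell$   (by 2, 3, DC)
 8. $\varphi^* \Rightarrow 31 \int leak \le \ell$   (by 6, 7, $Mono_l$)
 9. $(\alpha \frown \varphi^*) \Rightarrow 31 \int leak \le \ell$   (by 1, 8, DC)
10. $\beta \Rightarrow \int leak \le 1$   (DC)
11. $(\alpha \frown \varphi^* \frown \beta) \wedge \ell \ge 60 \Rightarrow 20 \int leak \le \ell$   (by 9, 10, DC, arithmetic)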
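
The arithmetic behind the last step can be spelled out as follows. This is a sketch under the assumption that the interval is chopped into a part of length $\ell_1$ satisfying $\alpha \frown \varphi^*$ and a part of length $\ell_2$ satisfying $\beta$; the names $\ell_1$ and $\ell_2$ are introduced here for illustration only.

```latex
\begin{align*}
\textstyle\int leak &\le \tfrac{1}{31}\,\ell_1 + 1
  && \text{by steps 9 and 10, adding durations over the chop point (DC5)}\\
20 \textstyle\int leak &\le \tfrac{20}{31}\,\ell + 20
  && \text{since } \ell_1 \le \ell = \ell_1 + \ell_2\\
\tfrac{20}{31}\,\ell + 20 \le \ell &\iff \ell \ge \tfrac{620}{11} \approx 56.4
\end{align*}
```

Hence the assumption $\ell \ge 60$ is more than enough to conclude $20 \int leak \le \ell$.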



4 Completeness of DC* Proof System for Simple 
Formulas 

As said in the introduction to the paper, our purpose is to give a rigorous study
of a class of DC* formulas that plays an important role in practice. The formulas
in the class are called simple formulas and will be considered to be executable. In
this section we extend the class of simple formulas, originally introduced in [4], by
allowing conjunction in simple formulas. We give a proof of the completeness
of the axiom system from the previous section for this class of formulas.

Definition 5. Simple DC* formulas are defined by the following BNF:

$\varphi ::= \lceil S \rceil \mid a \le \ell \mid \ell \le a \mid (\varphi \vee \varphi) \mid (\varphi \wedge \varphi) \mid (\varphi \frown \varphi) \mid \varphi^*$

Before giving our main result on the completeness, we should first mention
that we obtained our axioms $DC^*_1$-$DC^*_3$ from propositional dynamic logic [1].
We found that there is a certain degree of semantical compatibility between
interval logic frames and propositional dynamic logic frames, and then built a
truth-preserving translation of PDL formulas into interval logic ones based on
this semantic correspondence. We applied this translation to obtain our axioms
for iteration from the corresponding axioms in PDL. The readers are referred to
our full report [6] for the details of the translation.

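We are going to show that given a simple formula $\varphi$ and a DC* formula $\gamma$, DC
interpretations $\mathcal{I}$ that validate

$\ell = 0 \Rightarrow \gamma$
$(\gamma \frown \varphi) \Rightarrow \gamma$
$(\gamma \wedge \psi \frown \top) \Rightarrow (\psi \wedge \ell = 0 \frown \top) \vee (((\gamma \wedge \neg\psi \frown \varphi) \wedge \psi) \frown \top)$

for all DC* formulas $\psi$ should satisfy the equality $(\tilde{\mathcal{I}}(\varphi))^* = \tilde{\mathcal{I}}(\gamma)$. This means
that the axioms $DC^*_1$-$DC^*_3$ enforce the clause about iteration in the DC*
definition of $\models$ (Definition 2) for simple formulas $\varphi$. We do this in the following
way: given the assumption that $(\tilde{\mathcal{I}}(\varphi))^* \ne \tilde{\mathcal{I}}(\gamma)$, we find an interval $[\tau_1, \tau_2]$ and
a formula $\psi$ that refute some of the three formulas above under $\mathcal{I}$. Having found an
appropriate interval $[\tau_1, \tau_2]$, the formula $\psi$ we need is a *-free one that satisfies

$\tilde{\mathcal{I}}(\neg\psi) \cap \mathbb{I}([\tau_1, \tau_2]) = (\tilde{\mathcal{I}}(\varphi))^* \cap \mathbb{I}([\tau_1, \tau_2])$.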

4.1 Local Elimination of Iteration from Simple DC* Formulas 

Elimination of iteration from timed regular expressions, which are closely related to
DC* simple formulas, has been employed earlier under various other conditions
as part of model-checking algorithms by Dang and Pham [5], and Li and Dang [11].
Lemma 7, Lemma 8, and Proposition 10 given below are a slightly stronger form
of Lemma 3.6 from [11]. Because of the space limit, the proofs of these lemmas,
which can be found in [6], are omitted here.

Iteration can be locally eliminated from a formula $\varphi$ if, for every DC
interpretation $\mathcal{I}$ and every interval $[\tau_1, \tau_2] \in \mathbb{I}$, there exists a *-free formula $\varphi'$ such
that $\mathcal{I}, [\tau_1, \tau_2] \models \Box(\varphi \Leftrightarrow \varphi')$.

Due to the distributivity of conjunction and chop ($\frown$) over disjunction, simple
formulas that have no occurrences of * are equivalent to disjunctions of very
simple formulas, that are defined as follows:

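Definition 6. Very simple formulas are defined by the following BNF:

$\varphi ::= \ell = 0 \mid \lceil S \rceil \mid a \le \ell \mid \ell \le a \mid (\varphi \wedge \varphi) \mid (\varphi \frown \varphi)$

Lemma 7. Let $\mathcal{I}$ be a DC interpretation. Let $[\tau_1, \tau_2] \in \mathbb{I}$. Let $\varphi$ be a disjunction
of very simple formulas that contain no subformulas of the kind $a \le \ell$ with $a \ne 0$.
Then there exists a $k < \omega$ such that $\mathcal{I}, [\tau_1, \tau_2] \models \Box(\varphi^* \Leftrightarrow \bigvee_{j=0}^{k} \varphi^j)$.

Lemma 8. Let $\mathcal{I}$ be a DC interpretation. Let $[\tau_1, \tau_2] \in \mathbb{I}$. Let $\varphi$ be a disjunction
of very simple formulas. Then there exists a *-free simple formula $\varphi'$ such that
$\mathcal{I}, [\tau_1, \tau_2] \models \Box(\varphi^* \Leftrightarrow \varphi')$.

Lemma 9. Let $\varphi$ be a *-free simple formula. Then there exists a formula $\varphi'$ which
is a disjunction of very simple formulas, such that $\vdash_{DC} \varphi \Leftrightarrow \varphi'$.

Proposition 10. Let $\mathcal{I}$ be a DC interpretation. Let $[\tau_1, \tau_2] \in \mathbb{I}$. Then for every
simple formula $\varphi$ there exists a *-free simple formula $\varphi'$ such that $\mathcal{I}, [\tau_1, \tau_2] \models
\Box(\varphi \Leftrightarrow \varphi')$.

Proof. The proof is by induction on the number of occurrences of * in $\varphi$. Let $\psi^*$
be a subformula of $\varphi$ and let $\psi$ be *-free. By Lemma 9, $\vdash_{DC} \psi \Leftrightarrow \psi'$ for some
disjunction of very simple formulas $\psi'$. Now $\mathcal{I}, [\tau_1, \tau_2] \models \Box(\psi'^* \Leftrightarrow \psi'')$ for some
*-free simple formula $\psi''$ by Lemma 8. Hence $\mathcal{I}, [\tau_1, \tau_2] \models \Box(\varphi \Leftrightarrow \varphi')$, where $\varphi'$
is obtained by replacing the occurrence of $\psi^*$ in $\varphi$ by $\psi''$. Thus the number of
the occurrences of * in $\varphi$ is reduced by at least one. □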




148 Dang Van Hung and Dimitar P. Guelev 



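4.2 Completeness of $DC^*_1$-$DC^*_3$ for Simple DC* Formulas

In this section, we prove that a formula $\gamma$ is the iteration of a simple formula $\varphi$
if and only if it satisfies the axioms $DC^*_1$, $DC^*_2$ and $DC^*_3$ for all DC* formulas $\psi$.
The following proposition has a key role in our proof.

Proposition 11. Let $\mathcal{I}$ be a DC model that validates $\ell = 0 \Rightarrow \gamma$, $(\gamma \frown \varphi) \Rightarrow \gamma$ and
$(\gamma \wedge \psi \frown \top) \Rightarrow (\psi \wedge \ell = 0 \frown \top) \vee (((\gamma \wedge \neg\psi \frown \varphi) \wedge \psi) \frown \top)$
for some simple DC* formula $\varphi$, some arbitrary DC* formula $\gamma$, and
all DC* formulas $\psi$. Then $\tilde{\mathcal{I}}(\gamma) = (\tilde{\mathcal{I}}(\varphi))^*$.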

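Proof. The validity of $DC^*_1$ and $DC^*_2$ entails that $\tilde{\mathcal{I}}(\gamma) \supseteq (\tilde{\mathcal{I}}(\varphi))^*$. The proof
of this is trivial and we omit it. For the sake of contradiction, assume that
$[\tau_1, \tau_2] \in \tilde{\mathcal{I}}(\gamma) \setminus (\tilde{\mathcal{I}}(\varphi))^*$. By Proposition 10 there exists a simple formula $\varphi'$ such
that $\tilde{\mathcal{I}}(\varphi') \cap \mathbb{I}([\tau_1, \tau_2]) = (\tilde{\mathcal{I}}(\varphi))^* \cap \mathbb{I}([\tau_1, \tau_2])$. Let $\psi \rightleftharpoons \neg\varphi'$. Since $\mathcal{I}, [\tau_1, \tau_2] \models \gamma$
and $[\tau_1, \tau_2] \notin (\tilde{\mathcal{I}}(\varphi))^*$, we have $\mathcal{I}, [\tau_1, \tau_2] \models \gamma \wedge \psi$, and hence $\mathcal{I}, [\tau_1, \tau_2] \models
(\gamma \wedge \psi \frown \top)$. Since $[\tau_1, \tau_1] \in (\tilde{\mathcal{I}}(\varphi))^*$, $\mathcal{I}, [\tau_1, \tau_2] \not\models (\psi \wedge \ell = 0 \frown \top)$. Now assume
that $\mathcal{I}, [\tau_1, \tau_2] \models (((\gamma \wedge \neg\psi \frown \varphi) \wedge \psi) \frown \top)$. This entails that for some $\tau', \tau'' \in [\tau_1, \tau_2]$,
$\mathcal{I}, [\tau_1, \tau'] \models \neg\psi$ and $\mathcal{I}, [\tau', \tau''] \models \varphi$. Then for some $k < \omega$ there exist $\tau'_1, \ldots, \tau'_{k+1}$
such that $\tau_1 = \tau'_1 < \ldots < \tau'_{k+1} = \tau''$ and $\mathcal{I}, [\tau'_i, \tau'_{i+1}] \models \varphi$ for $i = 1, \ldots, k$, and
besides, $\mathcal{I}, [\tau_1, \tau'_{k+1}] \models \psi$. This implies that $[\tau_1, \tau'_{k+1}] \in (\tilde{\mathcal{I}}(\varphi))^* \setminus (\tilde{\mathcal{I}}(\varphi))^*$,
which is a contradiction. □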

Now let us state the completeness theorem for DC* with iteration of simple
formulas.

Theorem 12. Let $\varphi$ be a DC* formula. Let all of its *-subformulas be
simple. Then either $\varphi$ is satisfiable by some DC interpretation, or $\neg\varphi$ is derivable
in our proof system.

Proof. Assume that $\neg\varphi$ is not derivable. Let $\Gamma$ be the set of all the instances
of $DC^*_1$-$DC^*_3$. Then $\Gamma \cup \{\varphi\}$ is consistent, and by considering occurrences of a
formula of the form $\psi^*$ as a temporal variable, we have that $\Gamma \cup \{\varphi\}$ is a consistent
set of DC formulas with temporal variables. By the $\omega$-completeness of DC, there
exists an interpretation $\mathcal{I}$ and an interval $[\tau_1, \tau_2]$ such that $\mathcal{I}, [\tau_1, \tau_2] \models \Gamma, \varphi$. Now
Proposition 11 entails that $\tilde{\mathcal{I}}(\psi^*) = (\tilde{\mathcal{I}}(\psi))^*$ for all $\psi$ such that $\psi^*$ occurs in $\varphi$,
whence the modelling relation $\mathcal{I} \models \varphi$ is as required for a DC* interpretation. □

5 Decidability Results for Simple DC* and Discussion 

In this section, we discuss the decidability of the satisfiability of simple
DC* formulas and related work.

One of the notions in the literature that is close to our notion of simple
DC* formulas is the notion of Timed Regular Expressions introduced by Asarin et al. in
[3], a subset of which was introduced by us earlier in [11]. Each simple
DC* formula syntactically corresponds exactly to a timed regular expression,
and their semantics coincide. Therefore, a simple DC* formula can be viewed as
a timed regular expression. In [3], it has been proved that from a timed regular
expression E one can build a timed automaton A that recognises exactly the models
of E, and in which the constants occurring in the constraints for the clock variables
(guards, tests and invariants) are from the expression E. Since it is well known
[2] that the emptiness of timed automata is decidable when the
constants occurring in the guards and tests are integers, we can conclude that
if only integer constants are allowed in the inequalities in the definition of simple
DC* formulas, then the satisfiability of a simple DC* formula is decidable.

Theorem 13. Given a simple DC* formula $\varphi$ in which all the constants occurring
in the inequalities are integers, the satisfiability of $\varphi$ is decidable.

The complexity of the decision procedure, however, is exponential in the size
of the constants occurring in the clock constraints (see, e.g. [2]). Note that the
decidability of DC* could also be derived from the results in [12].

In [3] it is also shown that from a timed automaton, one can build a timed 
regular expression and a renaming of the automaton states such that each model 
of the timed regular expression is the renaming of a behaviour of the automaton. 
In this sense, we can say that the expressive power of the simple DC* formulas 
is the same as the expressive power of the timed automata. 

If we restrict ourselves to the class of sequential simple DC* formulas then
we can have a very simple decision procedure for the satisfiability, and some
interesting results. The sequential simple DC* formulas are defined by the
following BNF:

$\varphi ::= \ell = 0 \mid \lceil S \rceil \mid \varphi \vee \varphi \mid (\varphi \frown \varphi) \mid \varphi^* \mid \varphi \wedge a \le \ell \mid \varphi \wedge \ell \le a$

Because the operators $\frown$ and $\wedge$ distribute over $\vee$, and because of the
equivalence $(\varphi \vee \psi)^* \Leftrightarrow (\varphi^* \frown \psi^*)^*$, each sequential simple DC* formula $\varphi$ is equivalent
to a disjunction of simple formulas having no occurrences of $\vee$. Therefore $\varphi$ is
satisfiable iff at least one of the components of the disjunction is satisfiable. The
satisfiability of sequential simple DC* formulas having no occurrence of $\vee$ is easy
to decide.

In [11], we have developed some simple algorithms for checking a real-time
system whose behaviour is described by a ‘sequential’ timed regular expression
for a linear duration invariant of the form $\Box(a \le \ell \le b \Rightarrow \ldots)$.
Because of the obvious correspondence between sequential simple DC* formulas
and sequential timed regular expressions, these algorithms can be used for
proving automatically the implication from a sequential simple DC* formula to
a linear duration invariant. An advantage of the method is that it reduces the
problem to a number of linear programming problems, which are
well understood. Because of this advantage, in [5], we tried to generalise the
method to general simple DC* formulas, and showed that in most cases,
the method can still be used for checking the implication from a simple DC*
formula to a linear duration invariant.

Together with the proof system presented in the previous sections, these
decision procedures will help to develop a tool to assist the design and
verification of real-time systems.

It seems that with the extension of DC with the operator *, we can only
capture the “regular” behaviour of real-time systems. In order to capture their
full behaviour, we have to use the extension of DC with recursion ([12]). However,
we believe that in this case the proof system would be more complicated,
and would be far from complete.

Abstract. We present a verification methodology for combinational
arithmetic circuits which allows us to reason about circuits at a high level
of abstraction and to have better-structured and compositional proofs.
This is obtained using a categorical characterisation of the notion of data
refinement. Within this categorical framework we introduce a notion of
logical relation to deal with a language for hardware description.



1 Introduction 

This paper presents a methodology for circuit verification, particularly for
combinational arithmetic circuits. Our concern here is how to reason in a high-level
language about low-level objects. Concrete arithmetic circuits are low-level objects
typically described with operations on bits. But they are meant to implement
arithmetic operations such as addition, multiplication, etc. that are meaningful
at a higher level such as the level of natural numbers.

We view these two levels of abstraction as a step of data refinement. The
low level is seen as the level of the implementation (level of the actual circuits)
and the high level as the level of the specification. The methodology that we
present is based on data refinement, which originated in [7], and its categorical approach
(e.g. [8], [10]). It uses logical relations as abstraction relations between the two
levels. The use of logical relations allows us to deal with the higher-order structure
of circuits (circuit combinators), that is, operations that construct a circuit
from other circuits. Defining circuits by means of circuit combinators will allow
better-structured and compositional proofs.

2 Circuits, Specification, and Verification 

We begin by introducing a signature of a hardware description language. The
language is a simply-typed λ-calculus with sorts

T ::= B_n | 1 | T → T | T × T

where B_n (n > 0) is the type of an n-bit wire. Its operation symbols are primitive
circuits, for example, not : B_1 → B_1, and, or, xor : B_1 × B_1 → B_1, and
converters between n 1-bit wires and one n-bit wire, of sorts
B_n → B_1 × B_1 × ... × B_1 (n factors) and B_1 × B_1 × ... × B_1 (n factors) → B_n.
We may further include primitive circuit combinators, for example,
map : (T_1 → T_2) × T_1^n → T_2^n, using higher-order sorts.

So, circuit descriptions are programs in the language. Further, to write circuit
descriptions we use a Haskell-like notation whose meaning should be clear.

Example 1. A half-adder circuit hadd is described as

    hadd : B_1 × B_1 → B_1 × B_1
    hadd = λx. (and x, xor x)

A 1-bit full-adder fadd is made out of two half-adders hadd and an or

    fadd : B_1 × B_1 × B_1 → B_1 × B_1
    fadd = λ(a, b, c). let (u, v) = hadd (a, b)
                           (w, x) = hadd (v, c)
                       in (or (u, w), x)

Abstracting out hadd, we obtain the circuit combinator

    φ_fadd : (B_1 × B_1 → B_1 × B_1) × (B_1 × B_1 × B_1) → B_1 × B_1
    φ_fadd = λ(h, (a, b, c)). let (u, v) = h (a, b)
                                  (w, x) = h (v, c)
                              in (or (u, w), x)

which represents the connections to build fadd from the components hadd:
fadd = φ_fadd(hadd).
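
The example can be transcribed into an executable form. The following Python sketch models 1-bit wires as the integers 0 and 1 and writes the combinator in curried form, so that `phi_fadd(hadd)` yields the full adder; this encoding is an assumption made for illustration and is not the paper's hardware description language.

```python
from itertools import product

def and_(a, b): return a & b
def or_(a, b): return a | b
def xor(a, b): return a ^ b

def hadd(a, b):
    """Half adder: returns (carry, sum)."""
    return and_(a, b), xor(a, b)

def phi_fadd(h):
    """Full-adder wiring, with the two-input adder component h left abstract."""
    def circuit(a, b, c):
        u, v = h(a, b)   # u: first carry, v: partial sum
        w, x = h(v, c)   # w: second carry, x: final sum
        return or_(u, w), x
    return circuit

fadd = phi_fadd(hadd)

# Exhaustive check against addition of natural numbers.
for a, b, c in product((0, 1), repeat=3):
    carry, total = fadd(a, b, c)
    assert 2 * carry + total == a + b + c
print(fadd(1, 1, 1))  # (1, 1)
```

The exhaustive loop is feasible only because the circuit has three 1-bit inputs.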

Actual circuits which work at the bit level are meant to implement
arithmetic functions which are meaningful at a more abstract level of, e.g., natural
numbers. For example, an adder circuit implements the addition function. The
specification of a circuit is at the same level of abstraction as the arithmetic
function rather than at the bit level. This idea raises the question of how to
develop a verification methodology that allows implementation at a low level
and reasoning at a higher level of abstraction.

The verification process begins with interpreting circuit descriptions at two
levels of abstraction: the concrete level (bit level) and the abstract level (natural
numbers level). A concrete-level circuit (concrete circuit) represents the actual circuit
(the implementation). An abstract-level circuit (abstract circuit) has exactly the same
topology as the concrete circuit but uses arithmetic functions in place of bit-level
operations. So, abstract circuits can be verified against the specifications at the
same high level. Our proposed methodology will make sure that proving the
correctness of the abstract circuit guarantees the correctness of the concrete
one.
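
The idea of interpreting one topology at two levels can be sketched as follows. In this Python illustration the full-adder wiring is parameterised by its components; the concrete level uses booleans, while the abstract level uses arithmetic on natural numbers. The particular arithmetic functions chosen for the abstract components, and the relation `represents`, are assumptions made for this sketch and are not taken from the paper.

```python
from itertools import product

def wiring(h, join):
    """Full-adder topology; h is the half-adder component, join merges the two carries."""
    def circuit(a, b, c):
        u, v = h(a, b)
        w, x = h(v, c)
        return join(u, w), x
    return circuit

# Concrete level: values are booleans standing for 1-bit wires.
concrete = wiring(lambda a, b: (a and b, a != b), lambda u, w: u or w)

# Abstract level (an assumed reading): values are natural numbers, components are arithmetic.
abstract = wiring(lambda m, n: ((m + n) // 2, (m + n) % 2), lambda u, w: u + w)

def represents(bit, number):
    """Abstraction relation between a 1-bit wire and a natural number."""
    return number == (1 if bit else 0)

for bits in product((False, True), repeat=3):
    numbers = [1 if b else 0 for b in bits]
    # The specification is checked at the abstract level only.
    carry, total = abstract(*numbers)
    assert 2 * carry + total == sum(numbers)
    # Related inputs are mapped to related outputs.
    assert all(represents(c, n) for c, n in zip(concrete(*bits), abstract(*numbers)))
print("abstract and concrete full adders agree on all 8 inputs")
```

Here the agreement between the levels is checked by enumeration; the point of the methodology is to obtain it from the structure of the circuit instead.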

3 Preliminaries 

We assume familiarity with the standard notions of category, functor, Cartesian
closed category (CCC) and its correspondence with simply-typed λ-calculus [12],
product of two categories, and the category Set of sets and total functions, which
can be found for example in [1, 13].

In this section, we give an account of the background that is necessary for our
verification methodology, such as the notion of sketch, the category Rel of binary
relations and the functor (dom, cod) from Rel to Set × Set.

3.1 Sketches 

In developing our methodology, we regard the terms of the hardware description
language as denoting arrows of a certain CCC HW. The purpose is to use the
characterization of HW to obtain an appropriate (logical) relation connecting
the concrete and abstract levels.

We can represent a signature of the hardware description language as a di- 
rected graph whose nodes are primitive sorts and whose arrows are primitive 
operations on values of the sorts. An operation with several arguments can be 
represented by an arrow from a node representing the Cartesian product of all 
the sorts of the arguments. Similarly, a higher-order operation can be represented 
by an arrow from a node representing a function space built from other sorts. In
this case, we need more than just a graph: we need some equations stating that
a certain node must be constructed from other sorts.

A sketch is a directed graph with some equations; it is a way of expressing a
structure. We refer the reader to [1] for an introduction to the concept of sketch.
There are many kinds of sketch; the one used in this paper is the Cartesian closed
(CC) sketch, which is an instance of the general sketch presented in [11].

Definition 1. A CC sketch $S = (G, E)$ consists of a graph $G$ and a set $E$ of
equations of the form $c = \alpha(c_1, \ldots, c_n)$ where $c$, $c_i$ are nodes or
edges of $G$ and $\alpha$ is an expression built from operations of CCC structure
(compositions $\circ$, identity arrows $\mathrm{id}_{-}$, products $\times$,
pairing $\langle -, - \rangle$, projections $\pi_{-,-}$, $\pi'_{-,-}$, the terminal
object $1$, the unique arrows $!_{-}$ to $1$, exponentials $[- \to -]$, currying
$(-)^*$, evaluation maps $\mathrm{ev}_{-,-}$). A model $M$ of $S$ in a Cartesian
closed category $C$ is a graph homomorphism from $G$ to the underlying graph of $C$
such that $Mc = \alpha(Mc_1, \ldots, Mc_n)$ holds in $C$.

Hence, a model of the sketch S = (G, E) in the category Set interprets sorts 
and operations as sets and functions, respectively. Further, the interpretation 
satisfies the equations stated in E. For instance, when $X = A \times B$ is given in
E, the interpretation of X is the product of the interpretations of A and B.

For any CC sketch $S = (G, E)$ there is the CC category $\mathcal{F}_{CC}(S)$ which
is freely generated by S subject to the equations in E [11]. The CC category
$\mathcal{F}_{CC}(S)$ comes with a model $\iota : S \to \mathcal{F}_{CC}(S)$ and has
the following characterization: for every Cartesian closed category $C$ and every
model $F_0 : S \to C$ there is a unique strict CC functor
$F : \mathcal{F}_{CC}(S) \to C$ for which $F \circ \iota = F_0$.

Intuitively this property says that the objects and the arrows of
$\mathcal{F}_{CC}(S)$ are inductively generated by Cartesian closed operations from
the nodes and the edges of G, respectively, subject to the equations in E and the
axioms of CCC. The model $\iota$ is then given by inclusion of S in
$\mathcal{F}_{CC}(S)$ modulo the equations. It also says that $F$ is given by $F_0$
for primitive sorts and operation symbols, and by structural recursion for
constructed sorts and terms.

3.2 Category of Relations 

We regard a binary relation as a triple $(A, A', R)$ where $A$ and $A'$ are sets and
$R \subseteq A \times A'$. We use the notation $A \overset{R}{\longleftrightarrow} A'$
to mean $R$ is a binary relation over $A$ and $A'$, and $a\ R\ a'$ to mean
$(a, a') \in R$.

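Definition 2. The category Rel has as objects binary relations $(A, A', R)$ and
an arrow from $(A, A', R)$ to $(B, B', S)$ is a pair of functions
$(f : A \to B,\ f' : A' \to B')$ such that $\forall a \in A.\ \forall a' \in A'$,
if $a\ R\ a'$ then $f a\ S\ f' a'$. We use the following diagram for describing that
condition.

\[
\begin{array}{ccc}
A & \overset{R}{\longleftrightarrow} & A' \\
{\scriptstyle f}\downarrow & & \downarrow{\scriptstyle f'} \\
B & \overset{S}{\longleftrightarrow} & B'
\end{array}
\]

Composition and identities are given component-wise.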

It is routine to verify that the category Rel is Cartesian closed with 

- the terminal object is a triple whose components are singletons:

  $1_{Rel} = (1, 1, 1 \times 1)$ where $1$ is the singleton $\{*\}$,

- product is defined component-wise:

  $(A, A', R) \times (B, B', S) = (A \times B,\ A' \times B',\ R \times S)$ where
  $(a, b)\ R \times S\ (a', b')$ if and only if $a\ R\ a'$ and $b\ S\ b'$, and

- exponentiation:

  $[(A, A', R) \to (B, B', S)] = ([A \to B],\ [A' \to B'],\ [R \to S])$ where
  $f\ [R \to S]\ f'$ if and only if
  $\forall a \in A.\ \forall a' \in A'.\ a\ R\ a' \Rightarrow f a\ S\ f' a'$.

We write (dom, cod) for the forgetful functor from Rel to Set × Set sending
$(A, A', R)$ to $(A, A')$ and $(f, g)$ to $(f, g)$. Note that Set × Set is a CCC
with the structure given component-wise. It is straightforward to verify that
(dom, cod) is a strict Cartesian closed functor.

4 Circuit Descriptions 

Both the abstract-level and the concrete-level versions of a circuit have the
same topology given by the circuit description. We now introduce the category
HW whose objects are sorts of the language and whose arrows are circuit
descriptions (or simply circuits). We start with the CC sketch H of primitive
circuits, then we define the category HW to be the free CCC on it.
The carrier graph $H_0$ of H has as nodes $B_n$ (n > 0) of n-bit wire types,
together with $N_{B_1 \times B_1}$, $N_{[B_1 \to B_1]}$, etc. which represent the
product $B_1 \times B_1$ and the exponential $[B_1 \to B_1]$, respectively. The
edges of $H_0$ are the names of primitive circuits, including among others not,
and, or, and xor. The primitive circuit not is an edge from $B_1$ to $B_1$, whereas
the primitive circuits and, or, and xor are edges from $N_{B_1 \times B_1}$ to
$B_1$. We also consider the conversions $\triangleleft_n$ and $\triangleright_n$
between n 1-bit wires and one n-bit wire, that is, between the node
$N_{((B_1 \times B_1) \times \cdots) \times B_1}$ and $B_n$.

The equations of the sketch H simply specify $N_{B_1 \times B_1} = B_1 \times B_1$,
$N_{[B_1 \to B_1]} = [B_1 \to B_1]$, etc.

For the rest of the paper, we use the convention that the product associates to
the left, so $B_1 \times B_1 \times B_1$ means $(B_1 \times B_1) \times B_1$.

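The reason we consider both $B_n$ and
$N_{((B_1 \times B_1) \times \cdots) \times B_1}$ is that the abstract
interpretation of $B_n$ is not necessarily isomorphic to the n-fold product of that
of $B_1$. Thus, we do not demand the equations
$\triangleleft_n \circ \triangleright_n = \mathrm{id}$ and
$\triangleright_n \circ \triangleleft_n = \mathrm{id}$.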



Definition 3. The category HW is the free Cartesian closed category
$\mathcal{F}_{CC}(H)$ on the sketch H.

So, HW has a terminal object 1 and for each pair of objects A and B, a
product $A \times B$ and an exponential object $[A \to B]$. The products allow us to
express multiple input/output and the exponentials enable us to have circuit
combinators which can be used, for instance, to represent the connections of
a circuit. The arrows in HW are given by canonical arrows of CCC
($\mathrm{id}_{-}$, $!_{-}$, $\pi_{-,-}$, $\pi'_{-,-}$, $\mathrm{ev}_{-,-}$),
compositions ($f \circ g$), pairing ($\langle -, - \rangle$), currying ($(-)^*$),
and the primitive circuits in H. We omit the canonical isomorphisms of CCC, e.g.
$(A \times B) \times C \cong A \times (B \times C)$.

With the well-known correspondence between CCC and λ-calculus [12], we
identify arrows of HW and circuit descriptions.



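Example 2. A 2-bit full-adder fadd2 is made out of two full-adders fadd.

\[
\begin{aligned}
\mathit{fadd2} &: B_2 \times B_2 \times B_1 \to B_2 \times B_1 \\
\mathit{fadd2} &= \varphi_{\mathit{fadd2}}(\mathit{fadd}),
\end{aligned}
\]

where

\[
\begin{aligned}
\varphi_{\mathit{fadd2}} &: [B_1 \times B_1 \times B_1 \to B_1 \times B_1] \times (B_2 \times B_2 \times B_1) \to B_2 \times B_1 \\
\varphi_{\mathit{fadd2}} &= \lambda(f, (a, b, c)).\ \mathbf{let}\ (a_1, a_0) = \triangleright_2\ a \\
&\phantom{{}= \lambda(f, (a, b, c)).\ \mathbf{let}\ } (b_1, b_0) = \triangleright_2\ b \\
&\phantom{{}= \lambda(f, (a, b, c)).\ \mathbf{let}\ } (u, v) = f\ (a_0, b_0, c) \\
&\phantom{{}= \lambda(f, (a, b, c)).\ \mathbf{let}\ } (w, x) = f\ (a_1, b_1, u) \\
&\phantom{{}= \lambda(f, (a, b, c)).\ } \mathbf{in}\ (w,\ \triangleleft_2\ (x, v))
\end{aligned}
\]

and the notation $\varphi(\cdots)$ abbreviates the application of $\varphi$ that
involves conversions of the argument circuits $c_i : A \to B$ to global elements
of $[A \to B]$.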

Without having higher-order structure (circuit combinators), proving the
correctness of fadd2 involves proving the correctness of fadd for specific arguments
(fadd $(a_0, b_0, c)$ and fadd $(a_1, b_1, u)$). This is not the case when we have
circuit combinators. Instead, we prove the correctness of fadd and then instantiate
the proof for specific arguments.

5 Circuit Interpretations 

Definition 4. An interpretation of the circuit descriptions is a Cartesian closed
functor from HW to Set.

So, the interpretation of a circuit description (an arrow in HW) is a function.
By the freeness of HW, an interpretation $F$ is uniquely determined by a model
$F_0$ of the CC sketch H in Set.

If we interpret a circuit description at the level of bits, we get the
corresponding concrete circuit representing an actual circuit. Similarly, by
interpreting a circuit description at the level of the specification of the
circuit, for instance the natural numbers level, we get the corresponding abstract
circuit.

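We define the concrete model $C_0$ of H in Set by $C_0(B_1) = \mathrm{Bit}$
($= \{1, 0\}$) and $C_0(B_n) = \mathrm{Bit}^n$ for nodes, and for edges:

\[
\begin{aligned}
\mathit{not}_C &: \mathrm{Bit} \to \mathrm{Bit} \\
\mathit{not}_C &= \lambda x.\ \mathbf{if}\ (x = 1)\ \mathbf{then}\ 0\ \mathbf{else}\ 1 \\
\mathit{and}_C, \mathit{or}_C, \mathit{xor}_C &: \mathrm{Bit} \times \mathrm{Bit} \to \mathrm{Bit} \\
\mathit{and}_C &= \lambda(x, y).\ \mathbf{if}\ (x = 1 \wedge y = 1)\ \mathbf{then}\ 1\ \mathbf{else}\ 0 \\
\mathit{or}_C &= \lambda(x, y).\ \mathbf{if}\ (x = 1 \vee y = 1)\ \mathbf{then}\ 1\ \mathbf{else}\ 0 \\
\mathit{xor}_C &= \lambda(x, y).\ \mathbf{if}\ (x \neq y)\ \mathbf{then}\ 1\ \mathbf{else}\ 0 \\
\triangleleft_{nC}, \triangleright_{nC} &: \mathrm{Bit}^n \to \mathrm{Bit}^n \\
\triangleleft_{nC} &= \mathrm{id} \\
\triangleright_{nC} &= \mathrm{id}
\end{aligned}
\]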

The concrete interpretation $C$ is the unique strict CC functor with
$C \circ \iota = C_0$. In other words, $C$ is defined by structural recursion with
base cases given by $C_0$. So, the concrete interpretation $C$ of the primitive
circuits is given by the concrete model $C_0$. The concrete interpretation of the
1-bit full-adder fadd after unfolding hadd is




Logical Relations in Circuit Verification 






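\[
\begin{aligned}
C(\mathit{fadd}) &: \mathrm{Bit} \times \mathrm{Bit} \times \mathrm{Bit} \to \mathrm{Bit} \times \mathrm{Bit} \\
C(\mathit{fadd}) &= \lambda(a, b, c).\ \mathbf{let}\ (u, v) = (\mathit{and}_C\ (a, b),\ \mathit{xor}_C\ (a, b)) \\
&\phantom{{}= \lambda(a, b, c).\ \mathbf{let}\ } (w, x) = (\mathit{and}_C\ (v, c),\ \mathit{xor}_C\ (v, c)) \\
&\phantom{{}= \lambda(a, b, c).\ } \mathbf{in}\ (\mathit{or}_C\ (u, w),\ x)
\end{aligned}
\]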

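Similarly, we define the abstract model $A_0$ of H in Set by
$A_0(B_n) = \mathrm{Nat}$ (the set of natural numbers) for nodes, and for edges:

\[
\begin{aligned}
\mathit{not}_A &: \mathrm{Nat} \to \mathrm{Nat} \\
\mathit{not}_A &= \lambda x.\ (\mathrm{succ}\ x) \bmod 2 \\
\mathit{and}_A, \mathit{or}_A, \mathit{xor}_A &: \mathrm{Nat} \times \mathrm{Nat} \to \mathrm{Nat} \\
\mathit{and}_A &= \lambda(x, y).\ (x + y)\ \mathrm{div}\ 2 \\
\mathit{or}_A &= \lambda(x, y).\ (\mathrm{succ}\ (x + y))\ \mathrm{div}\ 2 \\
\mathit{xor}_A &= \lambda(x, y).\ (x + y) \bmod 2 \\
\triangleleft_{nA} &: \mathrm{Nat} \to \mathrm{Nat} \times \mathrm{Nat} \times \cdots \times \mathrm{Nat} \\
\triangleleft_{nA} &= \lambda x.\ ((x\ \mathrm{div}\ 2^{n-1}) \bmod 2,\ \ldots,\ (x\ \mathrm{div}\ 2^{0}) \bmod 2) \\
\triangleright_{nA} &: \mathrm{Nat} \times \mathrm{Nat} \times \cdots \times \mathrm{Nat} \to \mathrm{Nat} \\
\triangleright_{nA} &= \lambda(x_1, \ldots, x_n).\ x_1 \times 2^{n-1} + \cdots + x_n \times 2^{0}
\end{aligned}
\]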

Similarly, the abstract interpretation $A$ is the unique strict CC functor with
$A \circ \iota = A_0$. So, the abstract interpretation $A$ of the primitive circuits
is given by $A_0$. The abstract interpretation of the 1-bit full-adder fadd after
unfolding hadd is

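\[
\begin{aligned}
A(\mathit{fadd}) &: \mathrm{Nat} \times \mathrm{Nat} \times \mathrm{Nat} \to \mathrm{Nat} \times \mathrm{Nat} \\
A(\mathit{fadd}) &= \lambda(a, b, c).\ \mathbf{let}\ (u, v) = (\mathit{and}_A\ (a, b),\ \mathit{xor}_A\ (a, b)) \\
&\phantom{{}= \lambda(a, b, c).\ \mathbf{let}\ } (w, x) = (\mathit{and}_A\ (v, c),\ \mathit{xor}_A\ (v, c)) \\
&\phantom{{}= \lambda(a, b, c).\ } \mathbf{in}\ (\mathit{or}_A\ (u, w),\ x)
\end{aligned}
\]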
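
The relationship between the two models of the primitive circuits can be checked mechanically on the two bit values. The following sketch is only an illustration under our own naming (`not_c`, `and_a`, and so on are hypothetical Python stand-ins, with `succ x` written as `x + 1` and `div` as integer division); it is not the formal argument of the paper.

```python
# Concrete primitives on bits {0, 1}.
def not_c(x):
    return 0 if x == 1 else 1

def and_c(x, y):
    return 1 if (x == 1 and y == 1) else 0

def or_c(x, y):
    return 1 if (x == 1 or y == 1) else 0

def xor_c(x, y):
    return 1 if x != y else 0

# Abstract primitives on natural numbers.
def not_a(x):
    return (x + 1) % 2

def and_a(x, y):
    return (x + y) // 2

def or_a(x, y):
    return (x + y + 1) // 2

def xor_a(x, y):
    return (x + y) % 2

def agree_on_bits():
    """Check that each abstract primitive matches its concrete one on 0 and 1."""
    bits = (0, 1)
    unary = all(not_c(x) == not_a(x) for x in bits)
    binary = all(fc(x, y) == fa(x, y)
                 for fc, fa in ((and_c, and_a), (or_c, or_a), (xor_c, xor_a))
                 for x in bits for y in bits)
    return unary and binary
```

Outside the values 0 and 1 the abstract functions are still defined (for example `and_a(2, 2)` is 2), but those values have no concrete counterpart.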

We use the subscripts $C$ and $A$ to denote concrete and abstract circuits,
respectively.

6 Verification Methodology 

Given the description of a circuit and its specification, the verification process 
starts with interpreting the circuit description at the level of implementation (bit 
level) and at the level of specification (natural numbers level). We then get the 
corresponding concrete circuit and abstract circuit. Our verification consists only 
of proving, at the same high level of abstraction, that the abstract circuit satisfies 
the specification. The correctness of the concrete circuit follows automatically,
as we now explain using a correctness criterion from data refinement.

The correctness criterion for a circuit description says that if the concrete
circuit and the corresponding abstract one have appropriately related inputs
then the two circuits yield appropriately related outputs. We use binary logical
relations to relate concrete values and abstract values. The reason for using
logical relations is to be able to cope with the higher-order structure of circuits
(circuit combinators). The approach that we will use is based on a categorical
approach to data refinement [10].
6.1 Logical Relations

We present here the notion of logical relation between two interpretations of a
circuit. A discussion of logical relations can be found in [14].

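Definition 5. Let $I$, $J$ be two interpretations of circuit descriptions. Let
$(R_\alpha)$ be a family of binary relations $(I(\alpha), J(\alpha), R_\alpha)$
indexed by objects $\alpha \in$ HW. We say that $(R_\alpha)$ is a logical relation
between the two interpretations $I$ and $J$ if

- $a\ R_1\ b$ holds for (the unique) $a \in I(1)$ and $b \in J(1)$,

- $(a, b)\ R_{\alpha \times \beta}\ (a', b')$ if and only if $a\ R_\alpha\ a'$ and
  $b\ R_\beta\ b'$, for all $a \in I(\alpha)$, $a' \in J(\alpha)$, $b \in I(\beta)$
  and $b' \in J(\beta)$,

- $f\ R_{[\alpha \to \beta]}\ f'$ if and only if
  $\forall a \in I(\alpha).\ \forall a' \in J(\alpha).\ a\ R_\alpha\ a' \Rightarrow f a\ R_\beta\ f' a'$,
  for all $f \in I([\alpha \to \beta])$ and $f' \in J([\alpha \to \beta])$.

Observe that from the definition of the Cartesian closed category Rel, these
three conditions are equivalent to $R_1 = 1_{Rel}$,
$R_{\alpha \times \beta} = R_\alpha \times R_\beta$ and
$R_{[\alpha \to \beta]} = [R_\alpha \to R_\beta]$.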
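
To make the product and function-space clauses concrete, here is a small sketch that lifts a base relation to pairs and to functions. It is an illustration only: relations are modelled as Python predicates, the carriers are replaced by finite samples (`nat_pairs` covers only a small part of Nat × Nat), and all names are our own.

```python
from itertools import product

def rel_product(r, s):
    """(a, b) R x S (a', b')  iff  a R a' and b S b'."""
    return lambda p, q: r(p[0], q[0]) and s(p[1], q[1])

def rel_arrow(r, s, dom, dom_prime):
    """f [R -> S] f'  iff  related arguments give related results.
    dom and dom_prime are finite samples of the two argument carriers."""
    return lambda f, g: all(s(f(a), g(b))
                            for a in dom for b in dom_prime if r(a, b))

def r_bit(x, n):
    """Base relation between Bit and Nat: 0 ~ 0 and 1 ~ 1."""
    return x in (0, 1) and x == n

bit_pairs = list(product((0, 1), repeat=2))
nat_pairs = list(product(range(4), repeat=2))   # finite sample of Nat x Nat

# The relation at the type of a two-input, one-output gate.
r_and = rel_arrow(rel_product(r_bit, r_bit), r_bit, bit_pairs, nat_pairs)

and_c = lambda p: 1 if p == (1, 1) else 0       # concrete "and" on a pair
and_a = lambda p: (p[0] + p[1]) // 2            # abstract "and" on a pair
```

With these definitions `r_and(and_c, and_a)` holds, while `r_and(and_c, lambda p: 1)` does not, because the constant function sends the related inputs (0, 0) and (0, 0) to unrelated outputs.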



6.2 Correctness Criteria 



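Showing that the correctness of concrete circuits follows from the correctness
of the abstract ones amounts to finding an appropriate logical relation
$(R_\alpha)$ between the two interpretations $C$ and $A$ such that for every circuit
description circuit in HW the diagram

\[
\begin{array}{ccc}
C(\mathit{inputs}) & \overset{R_{\mathit{inputs}}}{\longleftrightarrow} & A(\mathit{inputs}) \\
{\scriptstyle C(\mathit{circuit})}\downarrow & & \downarrow{\scriptstyle A(\mathit{circuit})} \\
C(\mathit{outputs}) & \overset{R_{\mathit{outputs}}}{\longleftrightarrow} & A(\mathit{outputs})
\end{array}
\]

commutes in the sense described in the category Rel, that is, that
$C(\mathit{circuit})\ R_{[\mathit{inputs} \to \mathit{outputs}]}\ A(\mathit{circuit})$.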

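Thus, a family $(R_\alpha)$ gives, for every object $\alpha$ in HW, an object
$(C(\alpha), A(\alpha), R_\alpha)$ in Rel such that for every arrow
$c : \alpha \to \beta$ in HW, $(C(c), A(c))$ is an arrow in Rel from
$(C(\alpha), A(\alpha), R_\alpha)$ to $(C(\beta), A(\beta), R_\beta)$. In
other words, a logical relation $R$ between the two interpretations $C$ and $A$
is equivalent to a Cartesian closed functor $R$ from HW to Rel such that
$(\mathrm{dom}, \mathrm{cod}) \circ R = (C, A)$, i.e. the following commutes.

\[
\begin{array}{ccc}
 & & \mathbf{Rel} \\
 & \overset{R}{\nearrow} & \downarrow{\scriptstyle (\mathrm{dom},\, \mathrm{cod})} \\
\mathbf{HW} & \underset{(C,\, A)}{\longrightarrow} & \mathbf{Set} \times \mathbf{Set}
\end{array}
\]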

Recall that $C$ and $A$ are the interpretations uniquely corresponding to the
models $C_0$ and $A_0$, respectively. The basic lemma of logical relations is as
follows.







Lemma 1 (Basic lemma of logical relations). Given any model $R_0$ of the
CC sketch H in Rel with $(\mathrm{dom}, \mathrm{cod}) \circ R_0 = (C_0, A_0)$, the
unique CC functor (interpretation) $R$ with $R \circ \iota = R_0$ satisfies
$(\mathrm{dom}, \mathrm{cod}) \circ R = (C, A)$, i.e., $R$ is a logical relation
between the two interpretations $C$ and $A$ from HW to Set × Set.



Proof. It follows from the fact that HW is the free CCC on the sketch H and 
that (dom, cod) is a strict CC functor. 



Intuitively, this lemma says that the commutativity of the diagrams for com- 
pound circuits follows from the structural similarity between the abstract and 
concrete circuit and from the commutativity of the diagram for their primitive 
circuits. 

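To formulate the correctness criteria, what remains is to define appropriate
$R_{B_n}$ for each n and to prove the commutativity of the diagram for the primitive
circuits. For example, the diagram for and:

\[
\begin{array}{ccc}
C(B_1 \times B_1) = \mathrm{Bit} \times \mathrm{Bit} & \overset{R_{B_1} \times R_{B_1}}{\longleftrightarrow} & \mathrm{Nat} \times \mathrm{Nat} = A(B_1 \times B_1) \\
{\scriptstyle \mathit{and}_C}\downarrow & & \downarrow{\scriptstyle \mathit{and}_A} \\
C(B_1) = \mathrm{Bit} & \overset{R_{B_1}}{\longleftrightarrow} & \mathrm{Nat} = A(B_1)
\end{array}
\]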



Note that the outputs of $\mathit{and}_A$ on inputs which are not in the domain of
$R$ are irrelevant.



Definition 6. $R_{B_1}$, a relation over Bit and Nat, relates 1 with 1 and 0 with 0.
$R_{B_n}$, a relation over $\mathrm{Bit}^n$ and Nat, relates
$x \in \mathrm{Bit}^n$ and $x' \in \mathrm{Nat}$ iff $x$ is the binary
representation of $x'$.
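
As a small illustration of the relation on n-bit wires, the sketch below tests whether a tuple of bits is the binary representation of a natural number. The function name `r_bits` and the most-significant-bit-first ordering are assumptions of this sketch; the ordering is chosen to match the way the pair $(a_1, a_0)$ is written in Example 2.

```python
def r_bits(bits, n):
    """True iff the tuple `bits` (most significant bit first) is the
    binary representation of the natural number n."""
    value = 0
    for b in bits:
        if b not in (0, 1):
            return False        # not a bit vector at all
        value = 2 * value + b   # shift left and add the next bit
    return value == n
```

For a tuple of length one this reduces to the relation on single bits: `r_bits((1,), 1)` and `r_bits((0,), 0)` hold and nothing else does.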



The relation $R$ does not need to be a relation over Bit and Nat. It can be
defined as a relation over two other representations of numbers as long as it
appropriately relates the two representations.



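Example 3. Consider the 1-bit full-adder fadd. By Lemma 1, commutativity of the
diagrams for the primitive circuits guarantees that of

\[
\begin{array}{ccc}
\mathrm{Bit} \times \mathrm{Bit} \times \mathrm{Bit} & \overset{R_{B_1} \times R_{B_1} \times R_{B_1}}{\longleftrightarrow} & \mathrm{Nat} \times \mathrm{Nat} \times \mathrm{Nat} \\
{\scriptstyle \mathit{fadd}_C}\downarrow & & \downarrow{\scriptstyle \mathit{fadd}_A} \\
\mathrm{Bit} \times \mathrm{Bit} & \overset{R_{B_1} \times R_{B_1}}{\longleftrightarrow} & \mathrm{Nat} \times \mathrm{Nat}
\end{array}
\]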



Our verification task is then to prove that $\mathit{fadd}_A$ agrees with the
specification

\[
\mathit{fadd}_{spec} = \lambda(a, b, c).\ ((a + b + c)\ \mathrm{div}\ 2,\ (a + b + c) \bmod 2)
\]

for $0 \le a, b, c \le 1$. We write $\mathit{fadd}_A \approx \mathit{fadd}_{spec}$
to mean $\mathit{fadd}_A$ agrees with $\mathit{fadd}_{spec}$ on relevant inputs in
the domain of the logical relation.
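
Because the relevant inputs are just 0 and 1, this agreement can also be checked by exhaustive evaluation. The sketch below does so; it is a sanity check under our own Python naming, not a replacement for the high-level proof the methodology has in mind.

```python
from itertools import product

def and_a(x, y):
    return (x + y) // 2

def or_a(x, y):
    return (x + y + 1) // 2

def xor_a(x, y):
    return (x + y) % 2

def fadd_a(a, b, c):
    """Abstract full-adder obtained by unfolding the two half-adders."""
    u, v = and_a(a, b), xor_a(a, b)
    w, x = and_a(v, c), xor_a(v, c)
    return or_a(u, w), x

def fadd_spec(a, b, c):
    """Arithmetic specification: (carry, sum) of a + b + c."""
    return (a + b + c) // 2, (a + b + c) % 2

def agrees_on_bits():
    """Compare the two functions on all eight relevant inputs."""
    return all(fadd_a(a, b, c) == fadd_spec(a, b, c)
               for a, b, c in product((0, 1), repeat=3))
```

The restriction to relevant inputs matters: on the input (2, 2, 2), which is related to no bit triple, the two functions give different results.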
6.3 Compositional Verification

Even at the high level of abstraction, formally proving that an abstract circuit 
meets its specification is not easy to do directly, particularly when the circuit 
is large. We propose here to organise a large verification as the composition of 
smaller verifications of component circuits. This also allows reuse of component 
verification and high-level verification of circuits with many different component 
implementations. 

For the purpose of explanation, we take the simple example of a 2-bit
full-adder fadd2 having two 1-bit full-adders fadd as components. That
$C(\mathit{fadd2})$ is appropriately related to $A(\mathit{fadd2})$ is automatic,
but we must prove that $A(\mathit{fadd2})$ agrees with

\[
\mathit{fadd2}_{spec} = \lambda(a, b, c).\ a + b + c
\]

The problem is that fadd2spec does not have the same structure as fadd2 but 
we wish to use a lemma about fadd. 

This can be naturally resolved by dividing the verification into two steps.
Firstly, we verify fadd, i.e., prove that
$A(\mathit{fadd}) \approx \mathit{fadd}_{spec}$. Then, we consider fadd to be a
primitive operation and extend our sketch H to H' including fadd. The concrete
interpretation $C' : HW' \to \mathrm{Set}$ of the now extended category HW' is
given by setting $C'(\mathit{fadd}) = \mathit{fadd}_C$. For the abstract
interpretation, we set $A'(\mathit{fadd}) = \mathit{fadd}_{spec}$, which is a
simpler and higher-level formula than $\mathit{fadd}_A$. By the first step,
$C'(\mathit{fadd})$ is appropriately related to $A'(\mathit{fadd})$, hence so is
$C'(\mathit{fadd2}) = C(\mathit{fadd2})$ to $A'(\mathit{fadd2})$ by Lemma 1. Thus,
our verification is simplified to proving that
$A'(\mathit{fadd2}) \approx \mathit{fadd2}_{spec}$, where $A'(\mathit{fadd2})$ is a
simpler formula than $A(\mathit{fadd2})$.

The same organisation of proofs can be obtained by considering higher-order
circuit combinators. We consider that $\varphi_{\mathit{fadd2}}$ combines fadd taken
as an argument. The above amounts to organising the proof of
$A(\mathit{fadd2}) \approx \mathit{fadd2}_{spec}$ as

\[
\begin{aligned}
A(\mathit{fadd2}) &= A(\varphi_{\mathit{fadd2}})(A(\mathit{fadd})) \\
&\approx A(\varphi_{\mathit{fadd2}})(\mathit{fadd}_{spec}) \\
&\approx \mathit{fadd2}_{spec}
\end{aligned}
\]

Note that, without higher-order types in the language, the two occurrences of
fadd in fadd2 must be treated separately.
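
The second step of this chain can be illustrated by applying the abstract wiring directly to the specification of the component. In the sketch below all names are hypothetical, the conversions are the abstract ones for n = 2, and, as an assumption of ours, the specification is written in pair form (carry, 2-bit sum) so that it has the same shape as the circuit output.

```python
def split2_a(x):
    """Abstract 2-bit split: (high bit, low bit) of x."""
    return (x // 2) % 2, x % 2

def join2_a(x1, x0):
    """Abstract 2-bit join: x1 * 2 + x0."""
    return 2 * x1 + x0

def fadd_spec(a, b, c):
    return (a + b + c) // 2, (a + b + c) % 2

def phi_fadd2_a(f):
    """Abstract interpretation of the fadd2 wiring, abstracted over f."""
    def circuit(a, b, c):
        a1, a0 = split2_a(a)
        b1, b0 = split2_a(b)
        u, v = f(a0, b0, c)     # low-order column
        w, x = f(a1, b1, u)     # high-order column with carry u
        return w, join2_a(x, v)
    return circuit

def fadd2_spec(a, b, c):
    # Assumed pair form of a + b + c: (carry out, sum modulo 4).
    total = a + b + c
    return total // 4, total % 4

def check():
    """Compare wiring-applied-to-spec with the 2-bit specification."""
    g = phi_fadd2_a(fadd_spec)
    return all(g(a, b, c) == fadd2_spec(a, b, c)
               for a in range(4) for b in range(4) for c in range(2))
```

Here the component is used only through `fadd_spec`, so the gate-level structure of the 1-bit adder never enters the check.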



7 Concluding Remarks 

We have presented a methodology for describing and verifying combinational
arithmetic circuits. It is based on a category theoretic approach to data
refinement which uses logical relations as abstraction relations. Our methodology
allows us to reason about circuits at a high level of abstraction. Additionally,
having higher-order structure allows us to have compositional proofs.







The idea of using data or program refinement in circuit verification has been
explored before. Burch and Dill [3] use specification state and implementation
state spaces connected by an abstraction mapping. Their correctness criterion
diagram is similar to ours. There, implementations are verified directly against
specifications. Cyrluk [4] uses an abstraction mapping between specification state
and implementation state spaces to guide the structuring of proofs by transforming
terms involving implementation state variables into equivalent specification state
variables. Velev and Bryant [18] combine the approach of Burch and Dill with
abstraction methods for functional units in microprocessors. Jones and Sheeran [9]
view circuit design as program refinement. Hanna et al. [6] use the techniques of
data abstraction to build up specifications in a structured way.

Various uses of category theory as a guide in circuit verification have been
explored. Brown and Hutton [2] used category theory to formalise the use of
pictures in circuit design. Fourman [5] and Sabadini et al. [15] have developed
categorical models of sequential circuits. Sheeran [16] has shown that category
theory can be used to derive useful theorems about hardware verification.

Future work is to employ the methodology in practice to verify non-trivial
arithmetic circuits and to extend the methodology to verifying sequential
circuits. For these, we will enrich the hardware description language in terms of
both practicality and expressiveness, e.g. with polymorphism and dependent types.
Abstract. In this paper, we study two lemma methods for accelerating
Loveland’s model elimination calculus: one is lemma generalization
and the other is non-unit lemma matching. The derivation of lemmas in
this paper is a dynamic one, i.e., lemma generation is repeatedly
performed during an entire refutation search process. A derived lemma is
immediately generalized by investigating the obtained subproof of the
lemma. The lemma generalization increases the possibility of successful
applications of the lemma matching rule. The non-unit lemma matching
is an extension of the previously proposed unit lemma matching,
and has the ability to stably speed up the model elimination calculus
by monotonically reducing the refutation search space. We have
implemented a PTTP-based theorem prover, named I-THOP, which performs
unit lemma generalization and 2-literal non-unit lemma matching. We
report good experimental results obtained with I-THOP.



1 Introduction 

The lemma facility in top-down theorem proving has been pointed out to be quite
important and beneficial for avoiding redundant computation. Lemmas
are extra clauses which are derivable during a proof search, and can be identified
with initial axiom clauses once they are produced. The use of lemmas makes it
possible for a top-down theorem prover both to find a shorter proof and to cut
off duplicate computations.

Many first-order theorem provers utilize various sorts of lemma facility [1,
2, 3, 4, 7, 8, 10, 11, 12, 13, 14, 16, 18, 20, 21, 22]. However, the naive use of
lemmas often increases the breadth of the search space. It consequently causes
an explosion, and results in loss of efficiency of provers in many cases [7, 23,
24]. For a long time, automated theorem provers struggled with the extreme
deterioration and instability of proving performance caused by lemmas.
However, several technologies for overcoming this difficulty have been proposed
in this decade. Caching, proposed by Astrachan and Stickel [3], is a modified
method of lemmas. It succeeds in drastically reducing the search space of Horn
problems. However, caching has no effect for non-Horn problems. We proposed
various unit lemma matching rules for model elimination calculus [11, 12, 13],
which are applicable to non-Horn problems, and can stably accelerate a refutation
search process by monotonically reducing the search space. In this paper, we
shall extend this unit lemma matching method in two manners: one is lemma
generalization, and the other is an extended matching which is applicable to
non-unit lemmas.

The lemma generation process of this paper is a dynamic one, i.e., lemmas
are repeatedly derived during an entire proof search. Generated lemmas are
generalized, and are stored into a lemma database. The lemma generalization is a
subproof-based one, i.e., it can be performed by investigating the structure of the
obtained subproof of a lemma. The computational cost of this subproof-based
generalization is quite low. This real-time generalization of a lemma obtained
at a search stage clearly increases the possibility of controlled use of the lemma
at the later stages. Such a dynamic lemma generalization technique has never
been studied, to our knowledge, in the literature of fully-automated theorem
proving technology.^1 First we describe the lemma generalization method in
a general form. We have implemented a PTTP-based theorem prover, named
I-THOP, which can perform unit lemma generalization. Second we give an
experimental evaluation of the good performance of this restricted form, i.e., unit
lemma generalization with I-THOP.

Next, we give the non-unit lemma matching, which is an extension of the unit
lemma matching [11]. The non-unit lemma matching never broadens the breadth
of the search space, and makes it possible for a top-down theorem prover both
to find a shorter proof and to cut off duplicate computations.

Roughly speaking, non-unit lemma technology is classified into two
sub-categories: one is the local use of non-unit lemmas such as C-reduction, the
folding-down operation [14, 22] and anti-lemmas [10, 14], and the other is the
global one. Although little research has been done on the global use of non-unit
lemmas until now, Fuchs [8, 9] investigated a controlled global use of non-unit
lemmas. However, Fuchs’ work is an extension of Schumann [21], and its lemma
generation method is a static one, i.e., lemma production is performed only once at
the preprocessing stage. The non-unit lemma matching studied in this paper is a
sort of global use of non-unit lemmas with dynamic lemma generation.
Compared with the static one, the dynamic generation has the great advantage
that several difficult lemmas can be derived by using other lemmas obtained at
the previous stages. Such a nested lemma generation quite often occurs in real
theorem proving processes.^2 To our knowledge, the global use of non-unit
lemmas together with dynamic generation has also never been studied up to
this point. We implemented a 2-literal non-unit lemma matching rule in I-THOP,
and show some good experimental results for this restricted version of non-unit
lemma matching.

^1 Interactive theorem provers, especially the ones using the induction principle,
sometimes perform the generalization of intermediately obtained lemmas [5]. Also
some similar techniques such as explanation-based learning have been studied in the
field of machine learning [19].

This paper is organized as follows: Section 2 is preliminaries. In Sect. 3, firstly 
we give a dynamic lemma generation method; secondly we show a lemma gener- 
alization technology which is based on the investigation of a subproof; finally we 
give the non-unit lemma matching rule. Section 4 shows experimental results of 
the performance of the dynamic lemma generation, the unit lemma generaliza- 
tion and 2-literal non-unit lemma matching. Section 5 is the conclusion. 

2 Preliminaries 

We briefly introduce Tableau Model Elimination (TME) [8, 14] (sometimes
called the connection tableau calculus), which is a generalization of the
chain-based model elimination calculus originally proposed by Loveland [15].

A clause is a multi-set of literals, usually written as a disjunction
$L_1 \vee \ldots \vee L_n$.

Definition 1. A tableau $T$ for a set $S$ of clauses is a tree whose non-root nodes
are labeled with literals and these labels satisfy the condition: if the immediate
successor nodes $N_1, \ldots, N_n$ of a node of $T$ are labeled with literals
$L_1, \ldots, L_n$, then the clause $L_1 \vee \ldots \vee L_n$ (tableau clause) is
an instance of a clause in $S$. A tableau is called a Model Elimination (ME)
tableau^3 if, for every inner node $N$ (except the root) labeled with a literal
$L$, there is an immediate successor node $N'$ of $N$ such that $N'$ is labeled
with a literal $L'$ complementary to $L$.

The trivial tableau is a one-node tableau with a root only. If no confusion
arises, we shall identify a node $N$ of a tableau with the literal associated with
$N$ in an appropriate manner.

Let $T$ be a tableau. A node $N$ dominates a node $N'$ in $T$ if $(N', N)$ is in the
transitive closure of the immediate predecessor relation. An open branch in T is
a branch of T not containing two complementary literals. A closed tableau is a 
tableau which does not have any open branches. A subgoal of T is a literal at the 
leaf node N of an open branch in T. We also sometimes call N itself a subgoal. 
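
The notions of open branch, closed tableau and subgoal can be made concrete with a small tree structure. The sketch below is restricted to ground literals written as strings (with `~` for negation), so no unification is involved; the class `Node` and the helper names are our own and are not taken from I-THOP.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    literal: Optional[str]                 # None for the root
    children: List["Node"] = field(default_factory=list)

def complement(lit):
    """Complement of a ground literal, e.g. 'p' <-> '~p'."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def open_branches(node, path=()):
    """Yield every branch (tuple of literals) with no complementary pair."""
    if node.literal is not None:
        path = path + (node.literal,)
    if not node.children:
        if not any(complement(l) in path for l in path):
            yield path
        return
    for child in node.children:
        yield from open_branches(child, path)

def is_closed(root):
    """A tableau is closed if it has no open branch."""
    return next(open_branches(root), None) is None

def subgoals(root):
    """Subgoals are the leaf literals of the open branches."""
    return [branch[-1] for branch in open_branches(root) if branch]
```

For instance, for the hypothetical ground clause set {~p, p ∨ q, p ∨ ~q}, the tableau with root child `~p`, below it `p` and `q`, and below `q` the leaves `p` and `~q`, is closed; removing the two leaves below `q` leaves `q` as the only subgoal.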

Given a tableau $T$, a subgoal $L$ and a clause $C$, the tableau expansion of
$L$ in $T$ with $C$ is the following operation: firstly obtain a new variant
$L_1 \vee \ldots \vee L_n$ of $C$; secondly attach new leaf nodes
$N_1, \ldots, N_n$ as immediate successors of $L$ in $T$; finally label these new
leaves with the literals $L_1, \ldots, L_n$, respectively.

Definition 2. Given a set $S$ of clauses, the inference rules of TME for $S$ consist
of the following:

^2 We indeed confirmed this phenomenon through many experiments using the TPTP
library, by examining the tableau proofs produced by I-THOP.

^3 An ME tableau is sometimes called a connection tableau.







Koji Iwanuma and Kenichi Kishino 



Start. Given a trivial tableau, select a clause $C$ from $S$, and apply the tableau
expansion rule with $C$ to the root of the tableau.

Reduction. Given a tableau $T$, select a subgoal $L$ in $T$ and a node $L'$
dominating $L$ such that there is a most general unifier $\theta$ of $L$ and the
complement of $L'$, and then apply $\theta$ to the whole $T$. We say $L$ is reduced
by the ancestor $L'$.

Extension. Given a tableau $T$, select a subgoal $L$ in $T$ and a clause $C$ from
$S$, apply a tableau expansion step with $C$ to $L$ and immediately perform a
reduction step with $L$ and one of its newly created successors.

Given an ME tableau, the TME inference rules clearly generate only ME
tableaux. Throughout the rest of this paper, we assume all tableaux are ME
tableaux, thus we simply use the word tableau for ME tableau.

Definition 3. Let $S$ be a set of clauses, and $T$ and $T'$ be two tableaux. A
sequence $T_1, \ldots, T_n$ is called a TME derivation in $S$ from $T$ to $T'$ if

1. $T_1 = T$ and $T_n = T'$ hold, and

2. $T_{i+1}$ ($i = 1, \ldots, n - 1$) is obtained from $T_i$ by means of one of the
TME inference rules of $S$, i.e., a start step, an extension step or a reduction
step.

A TME refutation from $T$ is a TME derivation from $T$ to a closed tableau.
Finally, a TME refutation of $S$ is a TME derivation in $S$ from a trivial tableau
to a closed tableau.

In the following, an inference $I$ performed in a TME derivation is written as
a tuple $(s, (r, a))$ where $s$ is the subgoal, $r$ specifies the inference rule
applied to $s$, and $a$ indicates the input clause applied in the extension or
start rules or the ancestor literal used in the reduction. Furthermore, we
sometimes identify a TME derivation $T_1, \ldots, T_n$ with a sequence
$I_1, \ldots, I_{n-1}$ of inferences such that $I_j$ has been applied to $T_j$
($j = 1, \ldots, n - 1$).

Theorem 1 (Completeness [14, 15]). Let S be a set of clauses. S is unsat- 
isfiable iff there is a TME refutation of S . 



Example 1. Fig. 1 depicts an example of a closed tableau for the set of clauses.

$p(X) \vee q(X),\ p(X) \vee \neg q(X) \vee \neg r(X),\ r(a)\}$

The dotted line in Fig. 1 denotes the reduction of a subgoal with its ancestor
which is not an immediate predecessor.

3 Lemmas in Model Elimination 

Lemmas are extra clauses which are derivable during a proof search, and can be 
identified with initial axiom clauses once they are produced. The use of lemmas 
makes it possible for a tableau-based theorem prover both to find a shorter proof 
and to cut off duplicate computations. 
[Figure: tableau tree; recoverable node labels are ~p(a), p(a), ~q(a), q(a) and
r(a); closed branches are marked with *.]

Fig. 1. A closed ME tableau



3.1 Dynamic Lemma Generation 

Clausal non-unit lemmas can be created from a closed subtableau as follows:

Definition 4. Let $T$ be a non-trivial tableau and $L$ be a literal. We say $T$ is
an ME tableau with the head $L$, denoted as $T_L$, if the root node of $T$ is
labeled by $L$ and there is an immediate successor labeled with a literal $L'$
which is complementary to $L$.

If $T_s$ is a subtree of a tableau $T$, and contains only a node $N$ labeled by $L$
and all nodes dominated by $N$, then $T_s$ is called a subtableau of $T$ with the
head $L$. A lemma can be produced from a subtableau $T_s$ of $T$ if $T_s$ is closed
in the context of $T$.

Definition 5 (Lemma). Let $T$ be a tableau, and $T_L$ be a subtableau of $T$
with the head $L$. Suppose $T_L$ contains no open branches of $T$, and
$M_1, \ldots, M_n$ are all leaf literals in $T_L$ such that $M_i$
($i = 1, \ldots, n$) is reduced by an ancestor literal which dominates $L$, i.e.,
an ancestor literal occurring outside $T_L$. Then the clause
$\neg L \vee M_1 \vee \ldots \vee M_n$ is called a lemma produced by the subtableau
$T_L$.

Notice that if there are no leaf literals in $T_L$ reduced by ancestor literals
outside $T_L$, then the produced lemma becomes a unit clause.

Proposition 1. Let S be a set of clauses, T be a tableau of S, and C be a lemma 
produced by a subtableau Tg of T . Then C is a logical consequence of S . 

We dynamically and repeatedly perform the above lemma generation proce- 
dure during the search of a TME refutation. The dynamic generation enables us 
to derive a difficult lemma, which ordinarily needs a huge subtableau, by using 
other lemmas obtained at the previous stages. 
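
For ground literals, the extraction of a lemma from a closed subtableau can be sketched as follows. This is an illustration under several assumptions of ours: literals are strings with `~` for negation, a subtableau is a nested pair `(literal, children)`, the first literal of the lemma is taken to be the complement of the head, and a leaf is treated as reduced by an outside ancestor only when it cannot be closed inside the subtableau.

```python
def complement(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def lemma(ancestors, subtableau):
    """ancestors: literals strictly above the head on its branch.
    subtableau: nested pair (literal, [children]) rooted at the head.
    Returns the lemma as a set of literals, or None if a branch is open."""
    head = subtableau[0]
    outside = []                            # leaves closed by outside ancestors

    def walk(node, inside):
        lit, children = node
        inside = inside + [lit]
        if children:
            # Visit every child so that all outside reductions are collected.
            return all([walk(child, inside) for child in children])
        if complement(lit) in inside:
            return True                     # closed within the subtableau
        if complement(lit) in ancestors:
            outside.append(lit)             # reduced by an outside ancestor
            return True
        return False                        # open branch: no lemma yet

    if not walk(subtableau, []):
        return None
    return {complement(head), *outside}
```

With the hypothetical ground clauses {~p, p ∨ q, p ∨ ~q}, the subtableau headed by `q` under the ancestor `~p` yields the two-literal lemma {~q, p}, whereas the subtableau headed by `~p` needs no outside ancestor and yields the unit lemma {p}.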







Koji Iwanuma and Kenichi Kishino 



3.2 Lemma Generalization 

Let us consider the generalization of lemmas. Suppose $L_1$ and $L_2$ are literals.
We say $L_1$ is more general than (or simply, subsumes) $L_2$ if there is a
substitution $\theta$ such that $L_1\theta = L_2$.

Definition 6. We say a tableau $T_1$ is more general than a tableau $T_2$, denoted
by $T_1 \le T_2$, if the tree structures of $T_1$ and $T_2$ are isomorphic and there
is a substitution $\theta$ such that $L_1\theta = L_2$ for any pair of corresponding
nodes $L_1$ and $L_2$ in $T_1$ and $T_2$, respectively.

Definition 7. Let S be a set of clauses. Suppose $C_1$ and $C_2$ are lemmas
produced by tableaux $T_1$ and $T_2$ of S, respectively. We say $C_1$ is more
proof-based general than $C_2$, denoted by $C_1 \le_p C_2$, if $T_1 \le T_2$. We
denote $C_1 <_p C_2$ if $C_1 \le_p C_2$ but not $C_2 \le_p C_1$.

Given a set S of clauses, the relation $<_p$ obviously constitutes a well-founded
order. Thus there is a most proof-based general lemma for each lemma C, denoted
as mpg(C). That is, mpg(C) is a clause satisfying the condition:

1. $mpg(C) \le_p C$, and;

2. there is no clause C' such that $C' <_p mpg(C)$.

The clause mpg(C) is unique modulo renaming variables.

Given a lemma C produced by a tableau $T_L$ in a TME derivation, the clause
mpg(C) can easily be obtained by investigating the inference steps needed for
constructing $T_L$.

Definition 8. Let D be a TME derivation for a tableau T, and $T_L$ be a
subtableau of T with the head L. The inference sequence for $T_L$ is the
order-preserved sequence $I_1, \ldots, I_n$ of all inferences of D such that
$I_j$ ($j = 1, \ldots, n$) has been performed to either the node L or a node
dominated by L in $T_L$ during D.



Obviously, the first inference step in any inference sequence must be per- 
formed to the head node of the subtableau. 

Definition 9. Let L be a literal, p be the predicate symbol of L and n be the
arity of p. If L is positive, then the skeleton of L is the literal
$p(X_1, \ldots, X_n)$; otherwise $\neg p(X_1, \ldots, X_n)$.

Clearly the skeleton of L is the most general form of L.

Definition 10 (Generalization Algorithm). Let T be a tableau, $T_L$ be a
subtableau with the head L, and $I = I_1, \ldots, I_n$ be an inference sequence
for $T_L$. Obtain a generalized subtableau, $gene(T_L)$, from $T_L$ and I by
performing the following procedure:




Lemma Generalization and Non-unit Lemma Matching 






1. First, get a tableau $T_1$ by applying the first inference $I_1$ to the
one-node tableau labeled by the skeleton of L.

2. Obtain the tableau $T_{i+1}$, for each $i = 1, \ldots, n-1$, as follows:

(a) If $I_{i+1}$ is a reduction step with an ancestor literal outside $T_L$,
i.e., an ancestor literal dominating L, then skip the inference step $I_{i+1}$
and return the current $T_i$ as $T_{i+1}$;

(b) Otherwise, simply perform the inference step $I_{i+1}$ to the corresponding
subgoal in the tableau $T_i$, and return the resulting tableau as $T_{i+1}$.

3. Finally, return the resulting tableau $T_n$ as the answer $gene(T_L)$.

Let $C = \neg L \vee M_1 \vee \ldots \vee M_k$ be a lemma produced by a subtableau
$T_L$. Clearly, the head literal L' of $gene(T_L)$ is more general than L.
Moreover there are some literals $M'_1, \ldots, M'_k$ in $gene(T_L)$ corresponding
to $M_1, \ldots, M_k$, respectively, and each $M'_i$ ($i = 1, \ldots, k$) is more
general than the corresponding $M_i$. The clause
$C' = \neg L' \vee M'_1 \vee \ldots \vee M'_k$ is called a generalized lemma produced
by $gene(T_L)$. We say C' is properly generalized if $C' <_p C$.

Example 2. Consider the set S of clauses containing the three clauses:

    $p(X,Y) \vee \neg q(X,Y) \vee \neg r(Y)$    (1)
    $p(X,Y) \vee q(X,Y)$    (2)
    $r(a)$    (3)

Fig. 2 depicts two subtableaux for S. Assume that the tree traversal for tableau
construction is performed in left-to-right, depth-first order. We can obtain
the unit lemma p(a, a) from the left subtableau. However, we cannot apply this
lemma to the subgoal $\neg p(b, a)$, which appears in the right subtableau at a
later stage.










Fig. 2. A tableau without lemma generalization



Notice the lemma p(a, a) can immediately be generalized by investigating the
left subtableau. The skeleton of $\neg p(a, a)$ is the literal $\neg p(X, Y)$, and
the inference sequence for the left subtableau for $\neg p(a, a)$ is

$I_1$: ($\neg p(a, a)$, (extension, Clause (1)))
$I_2$: ($\neg q(a, a)$, (extension, Clause (2)))
$I_3$: ($p(a, a)$, (reduction, Ancestor $\neg p(a, a)$))
$I_4$: ($\neg r(a)$, (extension, Clause (3)))

Therefore we can construct a generalized tableau by repeating these inference
steps, i.e., first we apply the extension to the skeleton $\neg p(X, Y)$ with the
clause (1); second the extension to the immediate successor $\neg q(X, Y)$ with
the clause (2); third the reduction with the head literal $\neg p(X, Y)$; finally
the extension with the clause (3). The newly obtained subtableau has the head
literal $\neg p(X, a)$, where the first argument is not instantiated.
$\neg p(X, a)$ is properly generalized from $\neg p(a, a)$. If we store the unit
clause p(X, a) as a lemma, instead of p(a, a), then we can reduce the right
subtableau in Fig. 2, because the head literal $\neg p(b, a)$ can immediately be
resolved with the lemma p(X, a), as depicted in Fig. 3.
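
The replay of Example 2 can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are not part of the paper: terms are flat (variables are capitalized strings, constants are lowercase strings), literals are triples (sign, predicate, arguments), and the helper names `unify_args`, `rename`, `resolve` and `generalize_example` are hypothetical. Example 2 contains no reduction with an ancestor outside the subtableau, so no step has to be skipped here.

```python
def is_var(t):
    # Assumption of this sketch: variables start with an uppercase letter.
    return t[0].isupper()

def walk(t, s):
    # Follow variable bindings in substitution s until an unbound term is reached.
    while is_var(t) and t in s:
        t = s[t]
    return t

def unify_args(a, b, s):
    # Unify two argument tuples of flat terms, extending a copy of s.
    s = dict(s)
    for x, y in zip(a, b):
        x, y = walk(x, s), walk(y, s)
        if x == y:
            continue
        if is_var(x):
            s[x] = y
        elif is_var(y):
            s[y] = x
        else:
            return None
    return s

def rename(clause, k):
    # Rename the variables of a clause apart by appending the suffix _k.
    return [(sign, pred, tuple(f"{t}_{k}" if is_var(t) else t for t in args))
            for sign, pred, args in clause]

def resolve(goal, partner, s):
    # Close `goal` against the complementary literal `partner`.
    assert goal[0] != partner[0] and goal[1] == partner[1]
    s2 = unify_args(partner[2], goal[2], s)
    assert s2 is not None
    return s2

CLAUSE1 = [(True, "p", ("X", "Y")), (False, "q", ("X", "Y")), (False, "r", ("Y",))]
CLAUSE2 = [(True, "p", ("X", "Y")), (True, "q", ("X", "Y"))]
CLAUSE3 = [(True, "r", ("a",))]

def generalize_example():
    head = (False, "p", ("X", "Y"))   # skeleton of the head literal
    s = {}
    c1 = rename(CLAUSE1, 1)
    s = resolve(head, c1[0], s)       # I1: extension with clause (1)
    c2 = rename(CLAUSE2, 2)
    s = resolve(c1[1], c2[1], s)      # I2: extension with clause (2)
    s = resolve(c2[0], head, s)       # I3: reduction with the head literal
    c3 = rename(CLAUSE3, 3)
    s = resolve(c1[2], c3[0], s)      # I4: extension with clause (3)
    sign, pred, args = head
    # The unit lemma is the complement of the instantiated head literal.
    return (not sign, pred, tuple(walk(t, s) for t in args))
```

Calling `generalize_example()` yields the positive literal p with arguments ("X", "a"), i.e. the generalized unit lemma p(X, a) discussed above.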




Fig. 3. A tableau with the generalized lemma p(X, a)



Theorem 2. Let S be a set of clauses, C be a lemma produced by a subtableau
$T_L$ of S, and C' be a generalized lemma produced by $gene(T_L)$. Then C' is a
logical consequence of S. Moreover C' is the most proof-based general, that is,
C' is mpg(C).



3.3 Lemma Matching Rule 

Definition 11 (Naive Lemma Rule). Given a tableau T and a set U of 
lemmas generated up to the current stage, then select a subgoal L and a lemma 
M from U, apply a tableau expansion with M to T and immediately perform a 
reduction step with L and one of its newly created successors. 

The naive use of lemmas often increases the breadth of the search space, and
consequently causes its explosion in many cases. Naive methods utilizing
lemmas therefore often reduce the efficiency of a theorem prover. Of course,
there are fortunate cases where the search space is drastically reduced even
with a naive lemma method. However, the most important and difficult problem of
lemma methods is how to stabilize the effect of lemmaizing methods.

Definition 12 (Mandatory Inference Rule). Let I be an inference rule, T 
be a tableau and T' be the result of performing the rule I to T. We say I is 
mandatory when, if there is no TME refutation from T', then no TME refutation 
from T exists. 

If an inference rule I is mandatory and is successfully applied to a tableau
T, then we may immediately discard all other alternative inference rules to T at
this choice point, without losing completeness. Therefore a mandatory rule I
never causes the explosion of the search space if such an immediate cut operation
is performed. All mandatory rules can thus accelerate the refutation search by
monotonically reducing the search space.

Definition 13 (Unit Lemma Matching Rule [11]). Given a tableau T and
a set U of lemmas generated up to the current stage, then select a subgoal L and
a unit lemma A from U. If there is a substitution $\theta$ such that
$A\theta = \neg L$, then apply a tableau expansion with A to L and immediately
perform the substitution $\theta$ to the newly created successor of L.

Obviously the branch of T expanded by unit lemma matching is closed. Notice
that the unit lemma matching is a mandatory inference rule. This rule allows us
to use unit lemmas without causing the explosion of the search space. It has no
need of extra pruning operations, which would be necessary for cutting off some
redundancy caused by naive use of lemmas [11]. In this paper, we extend this
matching rule to non-unit lemmas.
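
The key point of unit lemma matching is that the substitution instantiates only the lemma, never the tableau. A minimal Python sketch of this one-way matching follows. It assumes flat terms (capitalized strings are variables), lemma variables renamed apart from tableau variables, and literals given as (sign, predicate, arguments); the function names are hypothetical and are not taken from I-THOP.

```python
def is_var(t):
    # Assumption of this sketch: variables start with an uppercase letter.
    return t[0].isupper()

def match_args(pattern, target):
    # One-way matching: bind variables of `pattern` only, so that
    # pattern * theta == target. Variables of `target` are treated as rigid.
    if len(pattern) != len(target):
        return None
    theta = {}
    for p, t in zip(pattern, target):
        if is_var(p):
            if theta.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return theta

def unit_lemma_match(lemma, subgoal):
    # The lemma must be complementary to the subgoal after instantiation.
    (ls, lp, la), (gs, gp, ga) = lemma, subgoal
    if ls == gs or lp != gp:
        return None
    return match_args(la, ga)
```

With the lemma p(X, a) and the subgoal ¬p(b, a) of Example 2, `unit_lemma_match` returns the binding of X to b, whereas the ungeneralized lemma p(a, a) does not match. A lemma p(b, a) does not match a subgoal ¬p(Z, a) either, because closing that branch would require instantiating the tableau variable Z.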

Definition 14 (Non-Unit Lemma Matching Rule). Given a tableau T and
a set U of lemmas generated up to the current stage, then select a subgoal L in
T and a non-unit lemma $M_1 \vee \ldots \vee M_n$ from U, and moreover select a
literal, say $M_1$, from $M_1, \ldots, M_n$. If

1. there is a substitution $\theta$ such that $M_1\theta = \neg L$, and

2. there are literals $L_2, \ldots, L_n$ in T such that $L_2, \ldots, L_n$ dominate
L and each $L_i$ is complementary to $M_i\theta$ for $i = 2, \ldots, n$,

then apply a tableau expansion with $M_1 \vee \ldots \vee M_n$ to L, and immediately
perform the substitution $\theta$ to all newly created n successors
$M_1, \ldots, M_n$.

Obviously, the new n branches containing L, which have been expanded by 
the non-unit lemma matching rule, are closed. The non-unit lemma matching is 
clearly a mandatory rule, because no substitution is performed to any variables 
appearing in the tableau T. Notice the subgoal L can be solved in the most 
general form if non-unit lemma matching is successfully applied. Non-unit lemma 
matching has the great power for reducing a search space of a TME refutation 
without a harmful side-effect. 
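
The two conditions of the non-unit lemma matching rule can be sketched as a small backtracking search in Python. The sketch assumes flat terms, lemma variables renamed apart from tableau variables, and literals given as (sign, predicate, arguments); the lemma and ancestors in the usage note are hypothetical, and the code is not the I-THOP implementation.

```python
def is_var(t):
    # Assumption of this sketch: variables start with an uppercase letter.
    return t[0].isupper()

def match_args(pattern, target, theta):
    # Extend a copy of theta so that pattern * theta == target (one-way matching).
    if len(pattern) != len(target):
        return None
    theta = dict(theta)
    for p, t in zip(pattern, target):
        if is_var(p):
            if theta.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return theta

def match_complement(lemma_lit, tableau_lit, theta):
    # The instantiated lemma literal must be complementary to the tableau literal.
    (s1, p1, a1), (s2, p2, a2) = lemma_lit, tableau_lit
    if s1 == s2 or p1 != p2:
        return None
    return match_args(a1, a2, theta)

def non_unit_lemma_match(lemma, k, subgoal, ancestors):
    # Condition 1: the selected literal lemma[k] matches the complement of the subgoal.
    theta = match_complement(lemma[k], subgoal, {})
    if theta is None:
        return None
    rest = lemma[:k] + lemma[k + 1:]

    # Condition 2: every remaining literal is closed by some ancestor of the subgoal.
    def close(i, theta):
        if i == len(rest):
            return theta
        for anc in ancestors:
            extended = match_complement(rest[i], anc, theta)
            if extended is not None:
                result = close(i + 1, extended)
                if result is not None:
                    return result
        return None

    return close(0, theta)
```

For the hypothetical lemma p(X) ∨ q(X, Y), the subgoal ¬p(a) and the single ancestor ¬q(a, b), the function returns the substitution binding X to a and Y to b; with the ancestor ¬q(c, b) it returns `None`, so the rule is not applicable.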

4 Experimental Results

We developed a TME theorem prover, named I-THOP, which is based on PTTP
technology [23, 24], and has additional features of dynamic lemma generation
and various sorts of lemma matching, such as unit lemma matching, identical
C-reduction and strong contraction [11, 13]. I-THOP participated in CASC-14
(the CADE-14 Automated theorem prover System Competition) [12, 26], and is one
of the advanced theorem provers in the world.

We are now developing an extended version of I-THOP. So far, we have fin- 
ished implementing some restricted forms of lemma generalization and non-unit 
lemma matching, i.e., unit lemma generalization and 2-literal non-unit lemma 
matching, respectively. We shall use this tentative version for experimental eval- 
uation of our proposals. 

We used as benchmark problems the set of eligible test problems for CASC-14.
The eligible test problems are carefully selected from the TPTP problem
library [25] in order to distinguish the theorem proving features of the
competitors of CASC-14, all of which are state-of-the-art theorem provers, such
as OTTER [17] and SETHEO [10, 18]. Thus these test problems are rather difficult
and high-level problems. The eligible problems consist of 420 unsatisfiable
clausal theories, more precisely 234 Horn problems and 186 non-Horn problems.
We ran an extended version of I-THOP on these test problems on SGI O2
workstations (R10000/174MHz CPU and 96MB memory).

Table 1 shows the performance of unit lemma generalization. The column
“I-THOP” is the result of I-THOP,^ and “ULG” is for the extended I-THOP
involving unit lemma generalization. The line “Solved Problems” indicates the
number of problems which can be solved within the time limit of 600 CPU seconds.
“CPU time” shows the average CPU time (seconds) over all problems commonly
solved by I-THOP and ULG.^ “Inferences” gives the average number of
inference steps needed for finding a refutation of the commonly solved problems.



Table 1. Results of unit lemma generalization

                         Horn                 Non-Horn
                    I-THOP      ULG       I-THOP      ULG
Solved Problems        105      109           82       87
CPU time (sec)        82.7     63.2         30.2     28.5
Inferences         94522.5  82797.0      40334.2  39833.6



^ Throughout this paper, I-THOP and its extended version always use the best lemma
matching mode, i.e., unit lemma matching, identical C-reduction, strong
contraction and so on, which was previously developed in [11].

^ Indeed, all problems solved by the original I-THOP can also be solved by the
extended I-THOP. This shows that lemma generalization has no harmful side effects.
To be exact, the commonly solved problems are 105 Horn problems and 82 non-Horn
problems.

Unit lemma generalization has a great effect on Horn problems. In average CPU
time, ULG is 25% faster than the original I-THOP. Additionally, four and five
problems can newly be solved in the Horn category and the non-Horn category,
respectively.
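
As a quick cross-check of the figures in Table 1, the relative reductions can be recomputed directly from the tabulated averages. The snippet below is only arithmetic on the published numbers; the helper name `relative_reduction` is hypothetical.

```python
def relative_reduction(before, after):
    # Fraction by which `after` is smaller than `before`.
    return (before - after) / before

# Averages over commonly solved problems, taken from Table 1.
horn_cpu = relative_reduction(82.7, 63.2)
horn_inferences = relative_reduction(94522.5, 82797.0)
nonhorn_cpu = relative_reduction(30.2, 28.5)
nonhorn_inferences = relative_reduction(40334.2, 39833.6)

for label, value in [("Horn CPU time", horn_cpu),
                     ("Horn inferences", horn_inferences),
                     ("Non-Horn CPU time", nonhorn_cpu),
                     ("Non-Horn inferences", nonhorn_inferences)]:
    print(f"{label}: {100 * value:.1f}% lower with ULG")
```

On these numbers the Horn CPU time drops by roughly 24% and the Horn inference count by roughly 12%, while the non-Horn figures change by only a few percent, which is in line with the observation that the effect is strongest for Horn problems.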

Table 2 shows the statistics of unit lemma generalization and its matching.
“Total Lemmas” specifies the total number of unit lemmas generated during the
entire refutation search processes for all test problems. “Generalized Lemmas”
shows the number of unit lemmas properly generalized. “Failure” indicates the
number of unit lemmas not properly generalized.

“Not-Supported Lemmas” shows the number of unit lemmas whose subproofs
are constructed with some extra inference rules, such as C-reduction or
Strong Contraction [11, 13]. These additional inference rules are valuable for
solving non-Horn problems [11, 13], but unfortunately are not supported by
the current implementation of unit lemma generalization. “Lemma Matchings”
indicates the average number of successful unit lemma matchings over all
problems, and “Matchings with GUL” shows the average number of lemma matchings
using properly generalized unit lemmas.



Table 2. Statistics of Generalization of unit lemmas

                        Horn             Non-Horn
Total Lemmas            14502            13158
Generalized Lemmas      6406 (44.2%)     1859 (14.1%)
Failure                 8096 (55.8%)     11206 (85.2%)
Not-Supported Lemmas    0 (0%)           93 (0.7%)
Lemma Matchings         547.7            320.1
Matchings with GUL      294.4 (53.4%)    221.0 (69.0%)



Unit lemma generalization is quite valuable for Horn problems. We can properly
generalize 44% of the generated unit lemmas in Horn problems. Moreover, 54%
of the unit lemmas used in lemma matching are properly generalized lemmas.
However, non-Horn problems seem to behave differently from Horn problems.
Only 14% of unit lemmas can be properly generalized, but almost 70% of unit
lemma matchings utilize some of the generalized lemmas.
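
The lemma-count percentages quoted from Table 2 can be recomputed from the raw counts. The following snippet is only arithmetic on the tabulated values; the helper name `share` is hypothetical.

```python
def share(part, whole):
    # Percentage of `whole` accounted for by `part`.
    return 100.0 * part / whole

# Counts taken from Table 2.
horn_generalized = share(6406, 14502)
horn_failure = share(8096, 14502)
nonhorn_generalized = share(1859, 13158)
nonhorn_not_supported = share(93, 13158)
nonhorn_gul_matchings = share(221.0, 320.1)

print(f"Horn, properly generalized: {horn_generalized:.1f}%")
print(f"Horn, not properly generalized: {horn_failure:.1f}%")
print(f"Non-Horn, properly generalized: {nonhorn_generalized:.1f}%")
print(f"Non-Horn, not supported: {nonhorn_not_supported:.1f}%")
print(f"Non-Horn, matchings with generalized lemmas: {nonhorn_gul_matchings:.1f}%")
```

Note that “Lemma Matchings” and “Matchings with GUL” are averages over problems, so a ratio of the two averages need not coincide exactly with a percentage printed in the table.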

I-THOP always retains, in a somewhat sophisticated way, the inference sequence
performed for the current tentative tableau. Thus, at any lemma generation
phase, we can immediately extract, from the entire sequence, the inference
subsequence that constructs the subtableau of the target lemma. Furthermore, a
generalized inference sequence can be reconstructed in linear time, using an
indexing of inference rules. The experiment showed that the overhead of the unit
lemma generalization task is less than 2% of the entire inference time, and thus
is negligible. The details are omitted here for lack of space.

Table 3 shows the performance of the 2-literal non-unit lemma matching rule
applied to the 186 non-Horn test problems. The column “2L-mat” is the result of
an extended I-THOP using 2-literal lemma matching. “2L-naive” is the result
with the naive use of 2-literal lemmas. The last column, “2L-mat & ULG”,
indicates the performance of an integrated version of the 2-literal lemma
matching and unit lemma generalization methods.

The meanings of “Solved Problems”, “CPU time” and “Inferences” are similar
to those of Table 1. The numbers listed in “CPU time” and “Inferences” are
averages over the 63 commonly solved problems. “Used 2L-lemmas” indicates the
average number of 2-literal lemmas used in each refutation search.



Table 3. Results of 2-literal lemma rules

                    I-THOP    2L-mat   2L-naive   2L-mat & ULG
Solved Problems         82        88         64             93
CPU time (sec)        10.7      10.9       21.3           11.2
Inferences         15154.2   14251.5    23256.4        14239.6
Used 2L-lemmas           -      13.9     7235.8           13.9



This experimental result clearly shows that the naive use of 2-literal lemmas
is quite harmful, because it tends to increase the number of alternatives at
each inference step of the refutation search. The naive use of 2-literal
lemmas often causes search space explosion. In contrast, 2-literal lemma
matching is quite stable and beneficial for solving more problems in a limited
time. Moreover, the two methods proposed here, unit lemma generalization and
2-literal lemma matching, work well together. The total number of solved
problems rises to 93 when they are integrated.

5 Conclusion 

In this paper, we studied lemma generalization based on the subproof. We also
gave the non-unit lemma matching rule for developing a new controlled use
of non-unit lemmas. Moreover, we showed through experiments the good performance
of some restricted forms of these methods, i.e., unit lemma generalization and
2-literal non-unit lemma matching. We emphasize that the lemma generation of
this paper is achieved during a refutation search process, dynamically
and repeatedly. Thus lemma generalization should also be performed dynamically
and repeatedly. The overhead of the lemma generalization task was verified
experimentally to be negligible. So far, the cooperative
integration of dynamic lemma generation, real-time lemma generalization and
the global use of non-unit lemmas has never been studied, to our knowledge,
from a unified viewpoint.

Non-unit lemma generalization seems to have a great ability to make TME
refutations much shorter. Our future work is experimental research on the
general forms of non-unit lemma generalization and the non-unit lemma matching
rule.

Abstract. The Coq and ProPre systems show the automated termination
of a recursive function by first constructing a tree associated with
the specification of the function which satisfies a notion of terminal
property and then verifying that this construction process is formally correct.
However, those two steps strongly depend on inductive principles and
hence Coq and ProPre can only deal with the termination proofs that
are inductive. There are however many functions for which the termination
proofs are non-inductive. In this article, we attempt to extend the
class of functions whose proofs can be done automatically à la Coq and
ProPre to a larger class including functions whose termination proofs are
not inductive. We do this by extending the terminal property notion and
replacing the verification step above by one that searches for a decreasing
measure which can be used to establish the termination of the function.



1 Introduction 

Termination is an important property in the verification, by automated
deduction, of programs defined on recursive data structures. While the problem
is undecidable, several proof methods of termination of functions have been
developed using for instance formal proof methods in functional programming or
orderings of rewriting systems. For instance we mention polynomial
interpretations [3,9,21], recursive path orderings [14] and Knuth-Bendix
orderings [7,12]. The latter methods are characterized by orderings called
simplification orderings [5,18] and deal with the termination of functions
called simply terminating functions. Some functions that are non-simply
terminating can be proven to terminate with methods based on structural
inductive proofs because they focus on recursive functions which can be viewed
as sorted constructor systems that allow reasoning on the orderings’ structure
of the data objects.

But there are other recursive functions that are non-simply terminating for
which inductive methods fail to prove termination, precisely because the
structural orderings on the terms of the algebra cannot be used. These functions,
which we call non-inductive-simply terminating functions, form an important
class of functions used in recursive data structures, and hence automatically
establishing their termination is important. This is witnessed by
the recent literature where techniques coming from rewriting [6,1,2,8] have been
proposed to automate the termination of such functions.

^ Supported by EPSRC GR/L15685.

P.S. Thiagarajan, R. Yap (Eds.): ASIAN’99, LNCS 1742, pp. 177-189, 1999.

(c) Springer-Verlag Berlin Heidelberg 1999

We are interested in automating the termination (inductive or non-inductive) 
of recursive functions in a theorem proving framework. We choose the framework 
of the system called ProPre developed in [17,15,16] which is also the one used in 
Coq [4]. ProPre is devoted to the termination of recursively defined functions. 
The Coq and ProPre systems show the automated termination of a recursive 
function by first constructing a tree associated with the specification of the 
function which satisfies a notion of terminal property and then verifying that the 
process of constructing such a tree is formally correct. The search of such trees, 
from which it is also possible to extract decreasing measures [19,10] through the 
recursive call of the function, relies in particular on the structure of multi-sorted 
algebras and hence automated termination proofs in Coq and ProPre strongly 
depend on inductive principles. This means that Coq and ProPre can only deal 
with the termination proofs that are inductive. 

Our aim is to extend the Coq and ProPre approaches to deal with automated 
non-inductive termination proofs. To do this, we introduce a new notion of ter- 
minal state property which has an algorithmic content that enables the method 
to be automated as an inductive one. From each tree that enjoys the terminal 
state property we associate an ordinal measure that couldn’t be previously ob- 
tained from the ProPre system [19,10] and we show the decreasing property that 
ensures in this way the termination of the recursive function. As a consequence, 
the technique allows inductive methods to go further in the proof search when 
natural structural orderings are not enough to achieve the proof. 

The paper is divided as follows: In Section 2, we set out the formal machinery. 
In Section 3, we introduce ProPre notion of terminal state property and our own 
extension of it. We show that our extension strictly includes the ProPre notion 
and establish in Theorem 1 that if a distributing tree A has the terminal state 
property in the system ProPre, then A has the new terminal state property and 
that the opposite does not hold. In Section 4, we explain how it is possible to 
define ordinal measures against trees of functions where if the ordinal measure 
decreases in the recursive call of the function, then this function terminates. 
We recall the ramified measures that come from the analysis of the ProPre 
system and we give new measures which will help in establishing terminations of 
functions where the proofs of terminations are non-inductive. Our main theorem 
of this section (Theorem 2) establishes that our new notion of terminal state 
and our extended notion of measures, enable us to establish the termination of 
functions (inductive and non inductive). 

2 Preliminaries 

2.1 Constructor Systems 

In this paper we deal with constructor systems and more precisely with sorted 
constructor systems. The following standard definitions are needed. 

Definition 2.1. We assume a set $\mathcal{F}$ of function symbols, called
signature, and a set S of sorts. To each function $f \in \mathcal{F}$ we
associate a natural number n that denotes its arity and a type
$s_1, \ldots, s_n \to s$ with $s, s_1, \ldots, s_n \in S$. A function is
called constant if its arity is 0.

We assume that the set of functions $\mathcal{F}$ is divided in two disjoint sets
$\mathcal{F}_c$ and $\mathcal{F}_d$. Functions in $\mathcal{F}_c$ (which also
include the constants) are called constructor symbols or constructors and those
in $\mathcal{F}_d$ are called defined symbols or defined functions.

Definition 2.2. Let $\mathcal{X}$ be a set of variables disjoint from
$\mathcal{F}$. We assume that each variable of $\mathcal{X}$ has a unique sort
and that for each sort s there is a countable number of variables in
$\mathcal{X}$ of sort s. If s is a sort, F and X are respectively sets
included in $\mathcal{F}_c \cup \mathcal{F}_d$ and $\mathcal{X}$, then
$T(F, X)_s$ is the smallest set such that:

1. every element of X of sort s is a term of sort s,

2. if $t_1, \ldots, t_n$ are terms of sorts $s_1, \ldots, s_n$ respectively, and
if f is a function of type $s_1, \ldots, s_n \to s$, then $f(t_1, \ldots, t_n)$
is a term of sort s.

If X is empty, we denote $T(F, X)_s$ by $T(F)_s$, whose elements are called
ground terms. If the arity of c is 0, the constant term c() is also denoted c.
Var(t) denotes the set of variables that occur in the term t, and Pos(t) is the
set of positions of t. If s and t are terms and q is a position of t, then the
term $t[s]_q$ is the term t in which the term s is now at position q.



Definition 2.3. A (sorted) equation is a pair $(l, r)_s$ of terms l and r of a
sort s, which is also called rewrite rule and written $l \to r$. A set of
(sorted) equations is non overlapping iff no two left-hand sides unify with each
other.

Definition 2.4. A specification or constructor system $\mathcal{E}$ of a
function $f : s_1, \ldots, s_n \to s$ in $\mathcal{F}_d$ is a non overlapping set
of left-linear equations $\{(e_1, e'_1)_s, \ldots, (e_p, e'_p)_s\}$ such that for
all $1 \le i \le p$, $e_i$ is of the form $f(t_1, \ldots, t_n)$ with
$t_j \in T(\mathcal{F}_c, \mathcal{X})_{s_j}$, $j = 1, \ldots, n$, and
$e'_i \in T(\mathcal{F}_c \cup \mathcal{F}_d, \mathcal{X})_s$.

Definition 2.5. Let $\mathcal{E}$ be a specification of a function f with type
$s_1, \ldots, s_n \to s$. A recursive call of f is a pair
$(f(t_1, \ldots, t_n), f(u_1, \ldots, u_n))$ where $f(t_1, \ldots, t_n)$ is a
left-hand side of an equation of $\mathcal{E}$ and $f(u_1, \ldots, u_n)$ is a
subterm of the corresponding right-hand side.
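
To make the notion of a recursive call concrete, the following Python sketch enumerates the recursive calls of a specification. The term encoding (variables as strings, applications as tuples whose first component is the symbol) and the Ackermann-style equations are assumptions chosen for illustration; they do not come from the paper.

```python
def subterms(t):
    # Enumerate all subterms of t in pre-order, including t itself.
    yield t
    if isinstance(t, tuple):
        for arg in t[1:]:
            yield from subterms(arg)

def recursive_calls(f, equations):
    # A recursive call pairs a left-hand side with a subterm of the
    # corresponding right-hand side whose head symbol is f.
    return [(lhs, u)
            for lhs, rhs in equations
            for u in subterms(rhs)
            if isinstance(u, tuple) and u[0] == f]

# Hypothetical specification of an Ackermann-style function over 0 and s.
ZERO = ("0",)
ACK = [
    (("ack", ZERO, "y"), ("s", "y")),
    (("ack", ("s", "x"), ZERO), ("ack", "x", ("s", ZERO))),
    (("ack", ("s", "x"), ("s", "y")), ("ack", "x", ("ack", ("s", "x"), "y"))),
]

calls = recursive_calls("ack", ACK)
```

For this specification the first equation contributes no recursive call, the second contributes one, and the third contributes two (the outer call and the nested call in its second argument).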



2.2 The Term Distributing Trees 

We give some ingredients that will be needed in the next sections. A term
distributing tree of a specification is a tree whose root can be seen as a tuple
of distinct variables, each node matches its children and each leaf corresponds
to a left-hand side of an equation. More precisely we have:

Definition 2.6. Let $\mathcal{E}$ be a specification of a function
$f : s_1, \ldots, s_n \to s$. $\mathcal{A}$ is a term distributing tree for
$\mathcal{E}$ iff it is a tree such that:

1. its root is of the form $f(x_1, \ldots, x_n)$ where $x_i$ is a variable of
sort $s_i$, $i \le n$;

2. each left-hand side of an equation of $\mathcal{E}$ is a leaf of
$\mathcal{A}$ (up to variable renaming);

3. each node $f(t_1, \ldots, t_n)$ of $\mathcal{A}$ admits one variable x' of a
sort s' such that the set of children of the node is (where
$x'_1, \ldots, x'_r$ are not in $t_1, \ldots, t_n$)

$\{ f(t_1, \ldots, t_n)[C(x'_1, \ldots, x'_r)/x'],\ C : s'_1, \ldots, s'_r \to s' \in \mathcal{F}_c \}$.
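
The third clause, which generates the children of a node by case analysis on one variable, can be sketched as follows. The encoding of terms (variables as strings, applications as tuples) and the names `substitute` and `children` are assumptions of this sketch; the fresh variables are simply named after the split variable and are assumed not to occur in the node already.

```python
def substitute(t, var, replacement):
    # Replace every occurrence of the variable `var` in t by `replacement`.
    if isinstance(t, str):
        return replacement if t == var else t
    return (t[0],) + tuple(substitute(a, var, replacement) for a in t[1:])

def children(node, var, constructors):
    # `constructors` maps each constructor of the sort of `var` to its arity.
    result = []
    for name, arity in constructors.items():
        fresh = tuple(f"{var}_{i}" for i in range(1, arity + 1))
        result.append(substitute(node, var, (name,) + fresh))
    return result

# Hypothetical example: natural numbers built from 0 and s.
NAT = {"0": 0, "s": 1}
kids = children(("f", "x", "y"), "x", NAT)
```

Splitting the root f(x, y) on x over the constructors 0 and s gives the two children f(0, y) and f(s(x_1), y); repeating the step on suitable variables eventually reaches the left-hand sides of the equations as leaves.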



Notation 2.7. Let $\mathcal{A}$ be a term distributing tree. A branch b from the
root $\theta_1$ to a leaf $\theta_k$ is denoted by
$(\theta_1, x'_1), \ldots, (\theta_{k-1}, x'_{k-1}), \theta_k$ where $\theta_1$ is
the root, $\theta_k$ is the leaf, and for each $i \le k-1$, $x'_i$ is the variable
x' for the node $\theta_i$ in the third clause of Definition 2.6.

It can be easily seen, according to Definition 2.6, that we have the following:

Fact 2.8. Let $\mathcal{E}$ be a specification of a function f of type
$s_1, \ldots, s_n \to s$ and $\mathcal{A}$ be a term distributing tree for
$\mathcal{E}$.

1. For each $(t_1, \ldots, t_n) \in T(\mathcal{F}_c)_{s_1} \times \ldots \times T(\mathcal{F}_c)_{s_n}$
there exists one and only one leaf $\theta$ of $\mathcal{A}$ and a ground
constructor substitution $\rho$ such that $\rho(\theta) = f(t_1, \ldots, t_n)$.

2. For every branch of $\mathcal{A}$ from the root to a leaf
$(\theta_1, x_1), \ldots, (\theta_{k-1}, x_{k-1}), \theta_k$ and for all
$i \le j \le k$, there exists a constructor substitution $\sigma_{j,i}$ such that
$\sigma_{j,i}(\theta_i) = \theta_j$. The substitutions $\sigma_{j,i}$ may also be
written as $\sigma_{\theta_j,\theta_i}$.

We introduce here a well-founded ordering relation on the terms. 

Definition 2.9. Assume a function m on the terms ranging over natural numbers
that is closed under substitutions, i.e. $m(u) > m(v)$ implies
$m(\sigma(u)) > m(\sigma(v))$ for all ground substitutions $\sigma$. Let
$u, v \in T(\mathcal{F}, \mathcal{X})_s$ for a given sort s. We say that
$u \sqsubset v$ iff u is linear and $m(u) < m(v)$ with
$Var(u) \subseteq Var(v)$.



3 Generalizing the Coq Termination Procedure and the 
ProPre System 

The analysis of the termination proofs using the Recursive Definition of the 
Coq assistant and the ProPre system shows that writing a formal proof in these 
systems can be regarded as the search of a term distributing tree that enjoys a 
terminal state property. That is to say, if a formal tree for a specification of a 
function can be built having a terminal state property then the function termi- 
nates. We define in this section a new terminal state property for a term distribut- 
ing tree generalizing that of Recursive Definition procedure and ProPre. We 
first give some notations that will be used in the rest of the paper. 

Notation 3.1. Let $\mathcal{A}$ be a term distributing tree for a specification.
If t is the left-hand side of an equation, b(t) will denote the branch in the
term distributing tree that leads to the term t. If b is a branch, then $L_b$
will denote the leaf of the branch b. Note that b and b(t) may denote two
distinct branches.

If a node $\theta$ matches a term u of a recursive call (t, u), then the
substitution will be denoted by $\rho_{\theta,u}$.

As every function that terminates with the procedure of the Coq assistant also 
terminates with the ProPre system, we do not give here the property in the 
setting of the Recursive Definition but only for the extended version of the 
ProPre system. Note that it is actually devised in a different way from below 
in [16]. However it has been shown [11] that for each distributing tree defined 
in [16] that enjoys the terminal state property of [16] there is a corresponding 
term distributing tree of Definition 2.6 that has the following property (Defini- 
tion 3.2) which is more convenient for our purpose. 

Definition 3.2. Let $\mathcal{A}$ be a term distributing tree for a
specification. We say that $\mathcal{A}$ has the terminal state property (tsp)
if there is an application $\mu : \mathcal{A} \to \{0, 1\}$ on the nodes of
$\mathcal{A}$ such that if L is a leaf, $\mu(L) = 0$, and for all recursive
calls (t, u), there is a node $(\theta, x)$ in the branch b(t) with
$\mu(\theta) = 1$ such that $\theta$ matches u with
$\rho_{\theta,u}(x) \sqsubset x$ and, for all ancestors $(\theta', x')$ of
$\theta$ in b(t) with $\mu(\theta') = 1$, we have
$\rho_{\theta',u}(x') \sqsubseteq x'$.

As already mentioned, if a specification $\mathcal{E}$ of a function admits a
term distributing tree that has the terminal state property, then $\mathcal{E}$
is terminating [17].

The rest of this section is devoted to define a new terminal state property gen- 
eralizing the previous one. We first need to introduce fresh variables as follows. 
For each position q and sort s, we will assume there is a new variable of sort s 
indexed by q and distinct from those of A'. 

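Definition 3.3. Let t be a term and q be a position. The term
$\lfloor t \rfloor_q$ is defined as follows: $\lfloor x \rfloor_q = x$ if x is a
variable, $\lfloor C(t_1, \ldots, t_n) \rfloor_q = C(\lfloor t_1 \rfloor_{q \cdot 1}, \ldots, \lfloor t_n \rfloor_{q \cdot n})$
if $C \in \mathcal{F}_c$, and $\lfloor g(t_1, \ldots, t_n) \rfloor_q = x_q$ if
$g \in \mathcal{F}_d$.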
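
The operation of Definition 3.3 replaces every outermost subterm headed by a defined function with a fresh variable indexed by its position. A minimal Python sketch follows, assuming variables are strings, applications are tuples, positions are tuples of argument indices, and fresh variables are named `x_<position>`; the function name `abstract_defined` is hypothetical.

```python
def abstract_defined(t, q, constructors):
    # Variables are left unchanged.
    if isinstance(t, str):
        return t
    head, args = t[0], t[1:]
    if head in constructors:
        # Constructor: recurse into the arguments, extending the position.
        return (head,) + tuple(
            abstract_defined(a, q + (i,), constructors)
            for i, a in enumerate(args, start=1))
    # Defined function: replace the whole subterm by the variable indexed by q.
    return "x_" + (".".join(map(str, q)) or "eps")

# Hypothetical example: c and s are constructors, f is a defined function.
example = abstract_defined(("c", "x", ("s", ("f", "y"))), (), {"c", "s"})
```

Here the subterm f(y) sits at position 2.1, so the result is c(x, s(x_2.1)); a term whose root is a defined function is replaced entirely by the variable for the root position.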

For a term u = g{u \^ . . . , Un) and a substitution g{p\u\) will denote the term 

We introduce the following relations: For u, u in T (JF^, we will say that 

u>vifu^v with ^(b) or ^(c) or m{v) < m{u) in Definition 2.9; and we will say 
that u ^ V if u lit V with (b) and (c) and m{u) = m{v). We will use the so-called 
size measure $|\cdot|$ for the mentioned measure m. In the following definitions
of the section we will consider a function $f : s_1, \ldots, s_n \to s$, a
specification or a split specification $\mathcal{E}$, and a term distributing
tree $\mathcal{A}$ of $\mathcal{E}$.

Definition 3.4. For each node 6^, C9 will denote {b ^ A, 0 ^ b} and IZ9 the 
set of recursive call (t, u) such that b{t) G C9. If (t, u) is a recursive call, then 
Ma(u) = {b e A,3p,p' such that f{pluj) = p^{f{Lb))} and QA{t,u) = {0 e 
b{t),3a,a{f{0)) = u}. 

Note that the set $Q_{\mathcal{A}}(t, u)$ is not empty since the root node belongs to $Q_{\mathcal{A}}(t, u)$.

Let 6 be a branch and two nodes 0,0' G 6, we say that 6^ < 6^' if 6^ is closer than 0' 
to the root (i.e. if 0 is an ancestor of 6^'). So we can write u) = maxQ^(t, u). 

For each node 6^ of ^ we assume an associated subset Qe of Tie which will 
be made explicit in Definition 3.7. Notice that the definitions below should be 
given simultaneously but are introduced separately to ease the readability. 

Definition 3.5. Let {0,x) be a node of A and Qe be a subset of 71$. For each 
recursive call (t, u) of Qe such that 0 G u), we assume that one of the two 

following cases below holds and we define as follows: 

1. If pe,u{x) L or pe,u(x) > crL^^^^^g{x), then = 1, 

2. If pe,u(x) =4 aL,^,^^ 0 (x), then = 0. 

The meaning of the above definition and the following one is to give decreasing 
criteria extending those of Definitions 2.9 and 3.2. It relies in particular on the 
hierarchical structure of the trees. 

Definition 3.6. Let (0,x) be a node of A and be a subset of TZ^. For each 
recursive call (t,u) of such that 0 G and for each branch h G C^, we 

will define rfy ^ in the following way: 

1. First take all (t, u) such that pe^u{x) > ^ ^ 

^ _ Jo if 6 G Ma(u), 

if not. 

2. Next, consider each (t, u) in such that there is a (T, u') with r/y, = 0, 

and for which no r/y ^ is already defined for any b' G . Then also take 

^ _ Jo if 6 G Ma(u), 

if not. 

3. Finally if item 2 cannot be applied, put rjy ^ = 1 for each b ^ Cq. 

Notice that the cases 1 and 2 are made distinct in the above definition as the 
value is algorithmically defined; namely case 1 is the initial case. 

We define, for each node 6^ of ^ and each left-hand side t of an equation 
where 0 with b{t) G Ce, r/f = Yi ^ = 0 if not. 

(t' ,u')^Ge 
0^QA(t' ,u') 

We now make explicit the subset $\mathcal{G}_\theta$ of $\mathcal{R}_\theta$ for a node $\theta$. The following definition
states whether, from each node, a recursive call can be eliminated from a set of recursive calls:

Definition 3.7. Let 0± be the root of the recursive distributing tree A. We first 
put = IZe ^ . Now assume that Qe is defined for a node 6^ of ^ and let 0' be 
a child of 0 with 0' in A. The set Qq' is then defined as follows: {t,u) G Qe' iff 
(f,u) r\Qe and , ??* ) 7 ^ (1,1)- 
Now, we define F which is a necessary condition for the termination statement.

Definition 3.8. Let 6^ be a node of associated tree A of the recursive distributing 
tree A distinct from a leaf. We put F{0) = 0 if there is a child 0' of 0 and (t, u) 
in Qof such that 0 > n); and we put F{0) = 1 if not. 

The new terminal state property can now be defined.

Definition 3.9. The recursive distributing tree $A$ is said to have the new
terminal state property if for each node $\theta$ of $A$ distinct from a leaf we have
$F(\theta) = 1$ and for each branch $b$ there is a node $\theta'$ in $b$ such that $\mathcal{G}_{\theta'} = \emptyset$.

We now come to Theorem 1 that states the above definition of the new terminal 
state property strictly includes the ProPre notion of terminal state property. 

Theorem 1. Let $\mathcal{E}$ be a specification of a function with a distributing tree $A$.
If $A$ has the terminal state property in the system ProPre, then $A$ has the new
terminal state property. The converse does not hold.

A crucial point is of course to make sure that the new terminal state property 
leads a function to terminate. We prove this result in the next section by showing 
the existence of measures decreasing through the recursive calls of the functions. 
Note that there exist decreasing measures coming from the formal termination
proofs in Coq or in ProPre. In contrast with these measures, the new one,
together with the new terminal state property, will allow one to prove the termination
of functions for which this usually cannot be done with inductive methods.

4 Dealing with a Non Inductive Method 

A notion close to that of term distributing trees of a specification having the
terminal state property is that of ramified measures. The measures coming from Coq or
ProPre characterize in some sense the induction proofs made in these systems.
We recall the ramified measures and explain why we need to introduce other 
measures to deal with termination that usually cannot be proven with inductive 
methods. Among these measures, a particular class is defined that is related to 
term distributing trees enjoying the new terminal state property and we show 
that they have the decreasing property. This, therefore, implies that the cor- 
responding functions terminate. As a consequence, this provides a method of 
reasoning about termination of recursive functions where the underlying proofs 
rely on non-inductive as well as inductive axioms. 

4.1 The Ramified Measures and the ProPre System 

Definition 4.1. Let $A$ be a tree and $\theta$ a node of $A$. The height of $\theta$ in $A$,
denoted by $H(\theta, A)$, is the height of the subtree of $A$ whose root is $\theta$ minus one.

For a term distributing tree $A$, we assume that for each node $\theta_i$ different from a
leaf there is an application $m_i$ that maps to the natural numbers. The general form
of ordinal measures introduced in [19] is given by the following
Definition 4.2. Let $\mathcal{E}$ be a specification of a function $f : s_1, \ldots, s_n \to s$, $A$
be a term distributing tree for $\mathcal{E}$ and $\omega$ be the least infinite
ordinal. The ramified measure $\Omega_A : T(F_c)_{s_1} \times \cdots \times T(F_c)_{s_n} \to \omega^\omega$ is defined by:

Let $t = (t_1, \ldots, t_n)$ be an element of the domain and $\theta$ be the leaf of $A$ such
that there is a substitution $\rho$ with $\rho(\theta) = f(t)$ (Fact 2.8). Let $B$ be the branch
$(\theta_1, x_1), \ldots, (\theta_{k-1}, x_{k-1}), \theta$ of $A$ from the root to $\theta$, let $\sigma_{r,s}$ be the substitutions
of Fact 2.8 and for each $\theta_i$ the associated application $m_i$, $i \le k - 1$. Then

k-i 

^A{t) = * mi{p{'^k,iixi))) ■ 

i=l 

The ramified measures can be illustrated by Figures 1 and 4. An interesting
subclass of the above measures is the class of R-measures. It has been shown that
to each formal termination proof of recursive functions made with the Recursive
Definition procedure of the Coq assistant, there is a distributing tree that has
the terminal state property implying the decreasing property of the R-measure
associated to the distributing tree [19]. This class of measures can be enlarged
with I-measures [10], which can be related to a more efficient version of ProPre [16].

The functions mi that occur in the definition of these measures belonging to 
the class of Definition 4.2 are directly supplied from formal proofs made in the 
system. This can be illustrated by Figures 2 and 3, where m is the parameterized 
function of Definition 2.9. 



Definition 4.3. The recursive length of a term t of sort s is defined by:



1. if t is a constant or a variable, then lg{t) = 1, 

2. if t = C(ti, . . . , tn) with C : <si, . . . ^Sn ^ s then lg{t) = 1 + 




Fig. 1. Ramified measures. Fig. 2. R-measures. Fig. 3. I-measures, $m_i \in \{m, 0\}$.

4.2 Extended Ordinal Measures 



We motivate here the definition of new ordinals illustrated with some examples. 

It is well known that the structural ordering is the most used among the
well-founded orderings on natural numbers. As claimed in [20], other
well-orderings on natural numbers are difficult to find automatically. As a simple
illustration, [20] gives the constant function with value 0, which can be defined
with the following specification $\mathcal{E}_1$.
Though a well-founded ordering is of course easy to find by a human in
this case, it is however difficult to obtain one in an automated way since it is 
a non-simply terminating function and not suited to inductive methods. Note 
that proving at the same time the correctness of the specification (i.e. f(x) = 0) 
and the termination seems not really relevant here as this is usually done with 
an ordering. Moreover the specification of the quot function given in this paper 
clearly shows that the correctness cannot be helpful in that case. 

Note that there is no term distributing tree of $\mathcal{E}_1$ which has the terminal
state property, but there is one that satisfies the new terminal state property.

The following example $\mathcal{E}_2$ of the function $evenodd : nat, nat \to bool$ is
borrowed from [1]. As mentioned in [1], modifying mutually recursive
functions to obtain a function without mutual recursion leads to specifications
such as that of evenodd below. We assume that not is already defined.

$evenodd(x, 0) \to not(evenodd(x, s(0)))$

$evenodd(0, s(0)) \to false$          (2)

$evenodd(s(x), s(0)) \to evenodd(x, 0)$.

The second argument is used as a flag that enables evenodd to compute
either the even function or the odd function. This function is not simply
terminating, and its termination cannot be proven with the usual inductive
methods, precisely because there is no natural ordering that can be used.
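
As a hedged illustration of how the flag works, the three rules of evenodd can be transcribed directly into a recursive function. The encoding of the Peano numerals 0 and s(0) as the Python integers 0 and 1, and the use of Python booleans for the sort bool, are assumptions of this sketch and not part of the original specification.

```python
def evenodd(x: int, flag: int) -> bool:
    """Direct transcription of the three rewrite rules (numerals as ints)."""
    if flag == 0:
        # evenodd(x, 0) -> not(evenodd(x, s(0)))
        return not evenodd(x, 1)
    if flag == 1 and x == 0:
        # evenodd(0, s(0)) -> false
        return False
    if flag == 1 and x > 0:
        # evenodd(s(x), s(0)) -> evenodd(x, 0)
        return evenodd(x - 1, 0)
    raise ValueError("no rule applies to these arguments")


# With flag 0 the function behaves as 'even', with flag s(0) as 'odd'.
evens = [n for n in range(10) if evenodd(n, 0)]
odds = [n for n in range(10) if evenodd(n, 1)]
```

Note that the first rule does not make any argument structurally smaller, which is why a purely structural induction does not apply.
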

Consider the next example, a specification of the function $quot : nat, nat, nat \to nat$,
borrowed from T. Kolbe [13] and also found in [2].

The value of $quot(x, y, z)$ corresponds to $1 + \lfloor (x - y)/z \rfloor$ when $z \neq 0$ and $y \le x$,
that is to say $quot(x, y, y)$ computes $\lfloor x/y \rfloor$.

$quot(0, s(y), s(z)) \to 0$

$quot(s(x), s(y), z) \to quot(x, y, z)$          (3)

$quot(x, 0, s(z)) \to s(quot(x, s(z), s(z)))$.

The last rule shows that the specification is not simply terminating. The same
rule also shows that termination cannot be proven by the usual induction proofs.
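
The behaviour of the three quot rules can be checked with a small sketch. As before, representing Peano numerals by Python integers is an assumption of the example; argument patterns not covered by any rule raise an error.

```python
def quot(x: int, y: int, z: int) -> int:
    """Direct transcription of the three rewrite rules (numerals as ints)."""
    if x == 0 and y > 0 and z > 0:
        # quot(0, s(y), s(z)) -> 0
        return 0
    if x > 0 and y > 0:
        # quot(s(x), s(y), z) -> quot(x, y, z)
        return quot(x - 1, y - 1, z)
    if y == 0 and z > 0:
        # quot(x, 0, s(z)) -> s(quot(x, s(z), s(z)))
        return 1 + quot(x, z, z)
    raise ValueError("no rule applies to these arguments")


# quot(x, y, y) agrees with integer division on the tested range.
table = [(x, y, quot(x, y, y)) for x in range(8) for y in range(1, 4)]
```

In the third rule the second argument grows from 0 to s(z) while the first is unchanged, which illustrates why no structural ordering on a single argument decreases in every recursive call.
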

It turns out that the termination of specifications such as (1), the evenodd function
(2) or the quot function (3) cannot be proven by the system ProPre, and
neither R-measures nor I-measures [19,10] have the decreasing property for any of
these specifications. However, the following ordinal function

$\Omega(u, 0) = \omega * |u|_\# + 1$, $\Omega(u, s(v)) = \omega * |u|_\#$,

where $|\cdot|_\#$ is the size function, i.e. $|0|_\# = 1$, $|s(u)|_\# = 1 + |u|_\#$, reflects
the specification of evenodd in the sense that it decreases in the recursive calls
of the function. There is also an ordinal measure, given below, that has the decreasing
property for the specification of the quot function:

$\Omega(u, s(v), w) = \omega * |u|_\#$, $\Omega(u, 0, w) = \omega * |u|_\# + 1$.
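
The decreasing property of these two ordinal functions can be checked mechanically on sample arguments. The sketch below rests on two assumptions that are not in the paper: Peano numerals are represented by Python integers, and an ordinal $\omega * a + b$ with natural numbers $a$, $b$ is encoded as the pair `(a, b)`, so that the ordinal ordering coincides with Python's lexicographic ordering on tuples.

```python
def size(n: int) -> int:
    """Size function: |0| = 1 and |s(u)| = 1 + |u|."""
    return n + 1


def omega_evenodd(u: int, flag: int):
    """omega * |u| + 1 if the flag is 0, omega * |u| if it is s(v)."""
    return (size(u), 1) if flag == 0 else (size(u), 0)


def omega_quot(u: int, v: int, w: int):
    """omega * |u| if the second argument is s(v), omega * |u| + 1 if it is 0."""
    return (size(u), 0) if v > 0 else (size(u), 1)


def evenodd_calls(x):
    # (left-hand side arguments, recursive call arguments) for rules 1 and 3
    return [((x, 0), (x, 1)), ((x + 1, 1), (x, 0))]


def quot_calls(x, y, z):
    # recursive calls of rules 2 and 3
    return [((x + 1, y + 1, z), (x, y, z)), ((x, 0, z + 1), (x, z + 1, z + 1))]


evenodd_ok = all(omega_evenodd(*lhs) > omega_evenodd(*rhs)
                 for x in range(20) for lhs, rhs in evenodd_calls(x))
quot_ok = all(omega_quot(*lhs) > omega_quot(*rhs)
              for x in range(6) for y in range(6) for z in range(6)
              for lhs, rhs in quot_calls(x, y, z))
```

A finite check of this kind is of course only a sanity test of the formulas, not a termination proof.
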

It would be possible to find a decreasing measure in the class of Definition 4.2
for (1), (2) or (3), but the choice of the $m_i$ is difficult to obtain in an automated
way. In particular we want to have $m_i$ functions that are as simple as possible,
such as for instance the size functions that are found in the above ordinals.
Furthermore we would like to relate decreasing measures to term distributing
trees that satisfy the new notion of terminal state property generalizing those
of Coq and ProPre. It turns out that such suitable measures actually belong to
the extended ordinal measures defined below. We first introduce the following

Definition 4 . 4 . Let f be a specification of a function f : si, ^ Sn ^ s such 
that there exists a term distributing tree ^ for S. For each node Oi of we will 
assume that there are associated applications . . . , rriij. that map on natural 
numbers whose number is equal to the number of the sub-branches starting from 
the node Oi. Note that this number may be distinct from the number of the 
children of the node. These applications will be called node measures. If 6^ is a 
leaf of a branch where Oi appears, we will also use me^^e to make explicit one of 
the node measures of Oi when necessary. 

Definition 4 . 5 . Let f be a specification of a function f : s±, . . . , Sn ^ s such 
that there exists a term distributing tree A ior £. The extended measure 
: T{Fc)si >!^ . . . * T {Fc)sn ^^7 is defined as follows: 

Let t = (ti, . . . An) be an element of the domain and 0 be the leaf of A such 
that there is a substitution p with p{ 0 ) = f{t) (Fact 2 . 8 ). Let B be the branch 
(6^1, xi), . . . , {Ok-i^Xk-i ),0 of A from the root to 6^, let be the substitutions 
of Fact 2 . 8 . Then 



k-i 

i=l 

An extended measure can be illustrated by Figures 5 and 6. 



Fig. 4. Term distributing with ramified measure.

Fig. 5. Term distributing with extended measure.



For instance the specification $\mathcal{E}_1$ with the equations (1) in Section 4.2 admits
a term distributing tree with an extended measure defined as follows:
$\Omega_A(0) = \omega * |0|_\#$, $\Omega_A(s(0)) = |0|_\#$, $\Omega_A(s(s(u))) = 0$.

This measure obviously has the decreasing property in the recursive calls of $\mathcal{E}_1$.
One may wonder whether the automation of decreasing measures belonging
to Definition 4.5 is possible since we have to take account of the $m_{i,j}$ applications.
We will show that it is enough to consider a subclass of measures, called
hole-measures, generalizing R- and I-measures. These measures will be associated
to term distributing trees enjoying the new terminal state property. We will show
that they have the decreasing property and that functions which admit a term
distributing tree with the new terminal state property are therefore terminating.



4.3 The Hole-Measures 

Definition 4.6. Let f be a specification of a function / : si, . . . , ^ s such 

that there exists a term distributing tree Al for f . The hole measure 
: T{Fc)si ^ {Fc)s^ is defined as follows: 

Let t = (ti, . . . ^tn) be an element of the domain and 0 be the leaf of A such 
that there is a substitution p with p{0) = f(t) (Fact 2.8). Let B be the branch 
(6^1, xi), . . . , of A from the root to 6^, let be the substitutions 

of Fact 2.8. Then 



k—l 

i=l 

O' 

That is to say * | • I#- 

Note that, due to the relation between the leaf 0 and the term t, * 
depends both on Oi and 0 in the above definition. 




Now we come to the main theorem of this paper. It states that our new notion
of terminal state property and our extended notion of measures enable us
to establish the inductive and non-inductive termination of functions.

Theorem 2. Let $\mathcal{E}$ be a specification of a function $f : s_1, \ldots, s_n \to s$ and
$A$ be a distributing tree for $\mathcal{E}$ having the new terminal state property. The
associated measure $\Omega_A$ satisfies the decreasing property, i.e., for each recursive
call $(f(t_1, \ldots, t_n), f(u_1, \ldots, u_n))$ of $\mathcal{E}$ and ground constructor substitution $\varphi$ we
have: $\Omega_A(\varphi(t_1), \ldots, \varphi(t_n)) > \Omega_A(\varphi(\lfloor u_1 \rfloor_1), \ldots, \varphi(\lfloor u_n \rfloor_n))$.
5 Conclusion

In this paper we have proposed a method that extends the automation of the 
proofs of termination of recursive functions used in ProPre and Coq. Whereas 
Coq and ProPre could only deal with the automation of inductive proofs, the 
method allows the automation of a larger class of recursive functions because 
non structural orderings can be handled by the method. The method is also 
a good vehicle for extending the automation of termination proofs of recursive 
functions to deal with issues not yet incorporated in theorem provers. 
Abstract. This paper aims at introducing an extension of M-nets, a
fully compositional class of high-level Petri nets, and of its low-level
counterpart, the Petri Box Calculus (PBC). We introduce a new operator
with nice algebraic properties which allows one to express asynchronous
communications in a simple and flexible way. With this extension,
asynchronous communications become at least as simple to express as (existing)
synchronous ones. Finally, we show how this extension can be used in
order to specify systems with timing constraints.



Keywords: Petri Net, Petri Box Calculus, M-Net, Semantics, Timed Spec- 
ification. 

1 Introduction 

M-nets, constructed on top of the algebra of Petri boxes [4,3,13], are a fruitful
class of high-level Petri nets which nicely unfold into low-level nets and thus
allow one to represent large (possibly infinite) systems in a clear and compact way.
They are widely used now as a semantic domain for concurrent system speci- 
fications, programming languages, or protocols, cf. [6,7,11,1,14,2,12]. The most 
original aspect of M-nets with respect to other high-level net classes is their full 
compositionality, thanks to their interfaces, and a set of various net operations 
defined for them. Their interest is augmented by the ability to use in practice 
an associated tool, PEP [5], which also offers various implemented verification 
and analysis methods. 

This paper defines a way to express asynchronous communication links
at the level of the M-net and PBC algebras by introducing a new operator (tie). This
extension mainly concerns the labelling of transitions: in addition to the
usual synchronous communications (with the synchronisation operator sy),
transitions may now also export or import data through asynchronous links. It turns out
that the tie operator has nice algebraic properties: idempotence, commutativity
with itself and with the synchronisation, and also coherence (i.e., commutativity)
with respect to the unfolding. These properties allow one to preserve the full
compositionality of the model.

As an application, we present a modeling of discrete time constraints in
M-nets. This allows one to specify time-constrained systems and to analyse their
unfoldings with existing tools (e.g., PEP). We use a high-level featured clock
which handles counting requests for the rest of the system. Asynchronous links
are used to perform the necessary communications between the actions related
to each request.

The next three sections briefly introduce M-nets and PBC, including the
basis for our extension. Then, sections 5 and 6 give the definitions and the
algebraic properties of the tie operation. Section 7 is devoted to the application
of asynchronous links to discrete time modelling.

2 Basic Definitions 

Let $E$ be a set. A multiset over $E$ is a function $\mu : E \to \mathbb{N}$; $\mu$ is finite if
$\{e \in E \mid \mu(e) > 0\}$ is finite. We denote by $\mathcal{M}_f(E)$ the set of finite multi-sets
over $E$, by $\oplus$ and $\ominus$ the sum and difference of multi-sets, respectively. $\otimes$ is used
to relate an element of $E$ to the number of its occurrences in a multi-set over
$E$; in particular, a multi-set $\mu$ over $E$ may be written as $\bigoplus_{e \in E} \mu(e) \otimes e$, or in
extended set notation, e.g., $\{a, a, b\}$ for $\mu(a) = 2$, $\mu(b) = 1$ and $\mu(e) = 0$ for all
$e \in E \setminus \{a, b\}$. We may also write $e \in \mu$ for $\mu(e) > 0$.
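
For readers who prefer an executable view, finite multi-sets can be sketched with Python's `collections.Counter`. This is only an illustration: in particular, `Counter` subtraction truncates at zero, and whether the difference used in the paper is truncated in the same way is an assumption of this sketch.

```python
from collections import Counter

# The multi-set {a, a, b}: mu(a) = 2, mu(b) = 1, mu(e) = 0 elsewhere.
mu = Counter({"a": 2, "b": 1})
nu = Counter({"a": 1, "c": 3})

msum = mu + nu    # multi-set sum: multiplicities are added
mdiff = mu - nu   # multi-set difference: truncated at zero, zero entries dropped

# Membership 'e in mu' corresponds to mu(e) > 0.
members = sorted(e for e in mu if mu[e] > 0)
```
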

Let Val and Var be fixed but suitably large disjoint sets of values and
variables, respectively. The set of all well-formed predicates built from the sets Val,
Var and a suitable set of operators is denoted by Pr.

We assume the existence of fixed disjoint sets $A_h$ of high-level action symbols
(for transition synchronous communications) and $B$ of tie symbols (for transition
asynchronous links). We assume that each element $A \in A_h$ has an arity $ar(A)$,
and that there exists a bijection $\widehat{\ }$ on $A_h$ (called conjugation), with $\widehat{\widehat{A}} = A$,
$\widehat{A} \neq A$ and $ar(\widehat{A}) = ar(A)$. We also assume that each element $b \in B$ has a type
$type(b) \subseteq Val$.

A high-level action is a construct $A(a_1, \ldots, a_{ar(A)})$, where $A \in A_h$ (notice
that we could have $\widehat{A}$ instead of $A$) and $a_i \in Val \cup Var$ ($1 \le i \le ar(A)$). A
typical high-level action is, for instance, $A(a_1, a_2, 5)$ where $A \in A_h$ and $a_1$, $a_2$
are variables. The set of all high-level actions is denoted by $AX_h$.

Similarly, a low-level action is a construct $A(v_1, \ldots, v_n) \in AX_h$, where $v_i \in Val$
for all $i \in \{1, \ldots, n\}$. The set of all low-level actions is denoted by $A_l$. As for the
high-level case, we assume that $A_l$ is provided with a bijection $\widehat{\ }$, with analogous
constraints; moreover, we will write $\widehat{A}(v_1, \ldots, v_{ar(A)})$ instead of $\widehat{A(v_1, \ldots, v_{ar(A)})}$.

A high-level link over $b$ is a construct $b^d(a)$, where $b \in B$, $d \in \{+, -\}$ is a link
direction symbol, and $a \in Val \cup Var$. The set of all high-level links is denoted
by $B_h$ and the set of all low-level links is denoted by
$B_l = \{b^d(v) \mid b \in B, d \in \{+, -\}, v \in Val\}$.

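The deletion is defined for a multi-set of links $\beta \in \mathcal{M}_f(B_h)$ and a tie symbol
$b \in B$, as

$$\beta \;\mathrm{del}\; b \;=\; \beta \ominus \Big( \bigoplus_{\mu \in \mathcal{M}_f(\{b^d(a) \,\mid\, d \in \{+,-\},\ a \in Var \cup Val\})} \mu \Big)$$

and analogously for low-level links.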
A binding is a mapping $\sigma : Var \to Val$ and an evaluation of an entity $\eta$ (which
can be a variable, a vector or a (multi-)set of variables, a set of predicates, a
(multi-)set of high-level actions, etc.) through $\sigma$ is defined as usual and denoted
by $\eta[\sigma]$. For instance, if $\sigma = (a_1 \mapsto 2, a_2 \mapsto 3)$, the evaluation of the high-level action
$A(a_1, a_2, 5)$ through $\sigma$ is the low-level action $A(2, 3, 5)$. Similarly, the high-level
link $b^+(a_1)$ evaluates through $\sigma$ to the low-level link $b^+(2)$ and the predicate
$a_1 = 2$ to true.
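
The evaluation of actions and links through a binding can be sketched as follows. The representation is an assumption of this example: variables are Python strings, values are integers, an action is a pair `(name, args)` and a link is a triple `(symbol, direction, argument)`.

```python
def evaluate(arg, binding):
    """Replace a variable (a string) by its bound value; values are kept as is."""
    return binding.get(arg, arg) if isinstance(arg, str) else arg


def eval_action(action, binding):
    """Evaluate a high-level action (name, args) into a low-level action."""
    name, args = action
    return (name, tuple(evaluate(a, binding) for a in args))


def eval_link(link, binding):
    """Evaluate a high-level link (symbol, direction, argument)."""
    symbol, direction, arg = link
    return (symbol, direction, evaluate(arg, binding))


sigma = {"a1": 2, "a2": 3}
low_action = eval_action(("A", ("a1", "a2", 5)), sigma)
low_link = eval_link(("b", "+", "a1"), sigma)
guard_value = (evaluate("a1", sigma) == 2)
```
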



3 Petri Boxes and M-Nets 

Petri Boxes are introduced in [4,3,6] as labeled place/transition Petri nets satis- 
fying some constraints, in order to model concurrent systems and programming 
languages with full compositionality. 

Definition 1. A (low-level) labeled net is a quadruple $L = (S, T, W, \lambda)$, where
$S$ is a set of places, $T$ is a set of transitions, such that $S \cap T = \emptyset$,
$W : (S \times T) \cup (T \times S) \to \mathbb{N}$ is a weight function, and $\lambda$ is a function, called the
labeling of $L$, such that:

- $\forall s \in S$: $\lambda(s) \in \{e, i, x\}$ gives the place status (entry, internal or exit,
respectively);

- $\forall t \in T$: $\lambda(t) = \alpha(t).\beta(t)$ gives the transition label, where $\alpha(t) \in \mathcal{M}_f(A_l)$ and
$\beta(t) \in \mathcal{M}_f(B_l)$.

The behavior of such a net, starting from the entry marking (just one token 
in each e-place), is determined by the usual definitions for place/transition Petri 
nets. 

M-nets are a mixture of colored net features and low-level labeled net ones. 
They can be viewed as abbreviations of the latter. 

Definition 2. An M-net is a triple $(S, T, \iota)$, where $S$ and $T$ are disjoint sets of
places and transitions, and $\iota$ is an inscription function with domain
$S \cup (S \times T) \cup (T \times S) \cup T$ such that:

- for every place $s \in S$, $\iota(s)$ is a pair $\lambda(s).\tau(s)$, where $\lambda(s) \in \{e, i, x\}$ is the
label of $s$, and $\tau(s) \subseteq Val$ is the type of $s$;

- for every transition $t \in T$, $\iota(t) = \lambda(t).\gamma(t)$, with $\lambda(t) = \alpha(t).\beta(t)$, the label
of $t$, where $\alpha(t) \in \mathcal{M}_f(AX_h)$ is the action label and $\beta(t) \in \mathcal{M}_f(B_h)$ is the
link label; $\gamma(t)$, the guard of $t$, is a finite set of predicates from Pr;

- for every arc $(s, t) \in (S \times T)$: $\iota((s, t)) \in \mathcal{M}_f(Val \cup Var)$ is a multiset of
variables or values (analogously for arcs $(t, s) \in (T \times S)$).

$\iota((s, t))$ will generally be abbreviated as $\iota(s, t)$.

A marking of an M-net $(S, T, \iota)$ is a mapping $M : S \to \mathcal{M}_f(Val)$ which
associates to each place $s \in S$ a multi-set of values from $\tau(s)$. In particular, we
shall distinguish (as for low-level nets) the entry marking, where $M(s) = \tau(s)$
if $\lambda(s) = e$ and the empty (multi-)set otherwise.
The transition rule specifies the circumstances under which a marking $M'$
is reachable from a marking $M$. A transition $t$ is enabled at a marking $M$ if
there is an enabling binding $\sigma$ for variables in the inscription of $t$ (making the
guard true) and in arcs around $t$ such that $\forall s \in S : \iota(s, t)[\sigma] \le M(s)$, i.e.,
there are enough tokens of each type to satisfy the required flow. The effect of
an occurrence of $t$, under an enabling binding $\sigma$, is to remove tokens from its
input places and to add tokens to its output places, according to the evaluation
of the arc annotations under $\sigma$.
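
The transition rule can be sketched in a few lines. The data layout is hypothetical and chosen only for this illustration: arc inscriptions and markings are `Counter` multi-sets, variables are strings, values are integers, and the guard is passed as a Python predicate on the binding.

```python
from collections import Counter


def eval_multiset(inscription, binding):
    """Evaluate a multi-set of variables/values through a binding."""
    out = Counter()
    for item, n in inscription.items():
        value = binding.get(item, item) if isinstance(item, str) else item
        out[value] += n
    return out


def enabled(in_arcs, guard, marking, binding):
    """True if the guard holds and every input place holds enough tokens."""
    if not guard(binding):
        return False
    for place, inscription in in_arcs.items():
        need = eval_multiset(inscription, binding)
        have = marking.get(place, Counter())
        if any(have[v] < n for v, n in need.items()):
            return False
    return True


def fire(in_arcs, out_arcs, marking, binding):
    """Remove tokens from input places and add tokens to output places."""
    new = {p: Counter(m) for p, m in marking.items()}
    for place, inscription in in_arcs.items():
        new[place] = new.get(place, Counter()) - eval_multiset(inscription, binding)
    for place, inscription in out_arcs.items():
        new[place] = new.get(place, Counter()) + eval_multiset(inscription, binding)
    return new


marking = {"s1": Counter({1: 1, 2: 1}), "s2": Counter()}
in_arcs = {"s1": Counter({"a1": 1})}
out_arcs = {"s2": Counter({"a1": 1})}
guard = lambda b: b["a1"] == 2
after = fire(in_arcs, out_arcs, marking, {"a1": 2})
```
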

The unfolding operation associates a labeled low-level net (see e.g. [4]) $U(N)$
with every M-net $N$, as well as a marking $U(M)$ of $U(N)$ with every marking
$M$ of $N$.

Definition 3. Let $N = (S, T, \iota)$; then $U(N) = (U(S), U(T), W, \lambda)$ is defined as
follows:

- $U(S) = \{s_v \mid s \in S \text{ and } v \in \tau(s)\}$,
and for each $s_v \in U(S)$: $\lambda(s_v) = \lambda(s)$;

- $U(T) = \{t_\sigma \mid t \in T \text{ and } \sigma \text{ is an enabling binding of } t\}$,
and for each $t_\sigma \in U(T)$: $\lambda(t_\sigma) = \lambda(t)[\sigma]$;

- $W(s_v, t_\sigma) = \sum_{a \in \iota(s,t) \,\wedge\, a[\sigma] = v} \iota(s, t)(a)$, and analogously for $W(t_\sigma, s_v)$.

Let $M$ be a marking of $N$. $U(M)$ is defined as follows: for every place
$s_v \in U(S)$, $(U(M))(s_v) = (M(s))(v)$. Thus, each elementary place $s_v \in U(S)$
contains as many tokens as the number of occurrences of $v$ in the marking $M(s)$.

4 Box and M-Net Algebras 

The same operations are defined at both the box and M-net levels. They can be
divided into two categories: the control flow ones and the communication ones. The
first group, which consists of sequential (;) and parallel (||) compositions, choice
($\Box$) and iteration ($[\ast\ast]$), can be synthesized from the refinement meta-operation [8,9]
and is not affected by our extension. The second group consists of the
operations based on transition composition; it is directly affected
here, so we introduce it in some detail. Only low-level operations
are defined formally, while we give some intuition for the high-level (M-net)
operations. We illustrate the low-level synchronization and restriction on an example
and we refer to [6] for further illustrations.

A synchronization $L$ sy $A_l$, with $A_l \in A_l$, adds transitions to the net $L$, and can
be characterized as CCS-like synchronization, extended to multi-sets of actions.
Intuitively, the synchronization operation of an M-net consists of a repetition
of certain basic synchronizations. An example of such a basic synchronization
over the low-level action $A(2,3)$ of the (fragment of) net $L$ is given in figure 1.
Transitions $t_1$ and $t_2$, which contain actions $A(2,3)$ and $\widehat{A}(2,3)$ in their labels,
can be synchronized over $A(2,3)$, yielding a new transition $(t_1, t_2)$. The repetition
of such basic synchronizations over $A(2,3)$, for all matching pairs of transitions
(containing $A(2,3)$ and $\widehat{A}(2,3)$), yields the synchronization of a net over $A(2,3)$.
Fig. 1. Synchronization and restriction in PBC.

In M-nets, the actions $A(a_1, a_2)$ and $\widehat{A}(a'_1, a'_2)$ can synchronize through renaming
and unification of their parameters.

Definition 4. Let $L = (S, T, W, \lambda)$ be a low-level net and $A_l \in A_l$ a low-level
action. The synchronization $L$ sy $A_l$ is defined as the smallest¹ low-level net
$L' = (S', T', W', \lambda')$ satisfying:

- $S' = S$, $T' \supseteq T$, and $W'|_{(S \times T) \cup (T \times S)} = W$;

- if transitions $t_1$ and $t_2$ of $L'$ are such that $A_l \in \alpha'(t_1)$ and $\widehat{A_l} \in \alpha'(t_2)$, then
$L'$ also contains a transition $t$ with its adjacent arcs satisfying:

  • $\lambda'(t) = \big(\alpha'(t_1) \oplus \alpha'(t_2) \ominus \{A_l, \widehat{A_l}\}\big).\big(\beta'(t_1) \oplus \beta'(t_2)\big)$;

  • $\forall s \in S'$: $W'(s, t) = W'(s, t_1) \oplus W'(s, t_2)$
  and $W'(t, s) = W'(t_1, s) \oplus W'(t_2, s)$.

The lowest part of figure 1 shows the restriction of $L$ sy $A(2,3)$ over the
action $A(2,3)$, which returns a net in which all transitions whose labels contain
at least one action $A(2,3)$ or $\widehat{A}(2,3)$ are deleted (together with their surrounding
arcs). The synchronization followed by the restriction is called scoping:
$[a : L] = (L$ sy $a)$ rs $a$, for an action $a$.



¹ With respect to net inclusion, and up to renaming of variables.
Fig. 2. High-level tie operation.



5 Asynchronous Link Operator: Tie 

In this section, we introduce a new M-net algebra operator, tie, devoted to
expressing asynchronous links between transitions. We first give an example at the
high level, and then define it formally at both the high and low levels.

In figure 2, the operator tie takes an M-net $N$ and a tie symbol $b$ (we assume that
$type(b) = \{1, 2\}$), and gives an M-net $N$ tie $b$ which is like $N$ but has an additional
internal place $s_b$ of the same type as $b$, and additional arcs between $s_b$ and
transitions which carry in their label (high-level) links over $b$. The inscriptions
of these arcs are (multi-)sets of variables or values corresponding to links over
$b$, and the labels of the concerned transitions are as before minus all links over $b$.
For instance, the arc from $t_1$ to $s_b$ is inscribed by $\{a_1\}$ because there is a link
$b^+(a_1)$ in the link label of $t_1$, which means that the variable $a_1$ has to be exported
through $b$.

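Definition 5. Let $N = (S, T, \iota)$ be an M-net and $b \in B$ a tie symbol. $N$ tie $b$ is
an M-net $N' = (S', T', \iota')$ such that:

- $S' = S \uplus \{s_b\}$, and $\forall s \in S'$:
$$\iota'(s) = \begin{cases} \mathsf{i}.type(b) & \text{if } s = s_b, \\ \iota(s) & \text{otherwise;} \end{cases}$$

- $T' = T$ and $\forall t \in T'$: $\iota'(t) = \alpha(t).(\beta(t) \;\mathrm{del}\; b).\gamma(t)$,
if $\iota(t) = \alpha(t).\beta(t).\gamma(t)$;

- $\forall s \in S'$ and $\forall t \in T'$ we have:
$$\iota'(s, t) = \begin{cases} \bigoplus_{a \in Var \cup Val} \beta(t)(b^-(a)) \otimes a & \text{if } s = s_b, \\ \iota(s, t) & \text{otherwise,} \end{cases}$$
$$\iota'(t, s) = \begin{cases} \bigoplus_{a \in Var \cup Val} \beta(t)(b^+(a)) \otimes a & \text{if } s = s_b, \\ \iota(t, s) & \text{otherwise.} \end{cases}$$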



The tie operation in the low-level is defined similarly. In that case, a place is 
created for each value in type{b) and arcs are added accordingly. 



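Definition 6. Let $L = (S, T, W, \lambda)$ be a low-level net and $b \in B$ a tie symbol.
$L$ tie $b$ is a low-level net $L' = (S', T', W', \lambda')$ such that:

- $S' = S \uplus \{s_{b,v} \mid v \in type(b)\}$, and $\forall s \in S'$:
$$\lambda'(s) = \begin{cases} \mathsf{i} & \text{if } s \in S' \setminus S, \\ \lambda(s) & \text{otherwise;} \end{cases}$$

- $T' = T$ and $\forall t \in T'$: $\lambda'(t) = \alpha(t).(\beta(t) \;\mathrm{del}\; b)$, if $\lambda(t) = \alpha(t).\beta(t)$;

- $\forall s \in S'$, $\forall t \in T'$ and $\forall v \in type(b)$ we have:
$$W'(s, t) = \begin{cases} \beta(t)(b^-(v)) & \text{if } s = s_{b,v} \in S' \setminus S, \\ W(s, t) & \text{otherwise,} \end{cases}$$
$$W'(t, s) = \begin{cases} \beta(t)(b^+(v)) & \text{if } s = s_{b,v} \in S' \setminus S, \\ W(t, s) & \text{otherwise.} \end{cases}$$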



6 Properties 

Theorem 1. Let $L$ be a low-level net, $A_l \in A_l$ and $b_1, b_2 \in B$. Then:

1. $(L$ tie $b_1)$ tie $b_1 = L$ tie $b_1$ (idempotence)

2. $(L$ tie $b_1)$ tie $b_2 = (L$ tie $b_2)$ tie $b_1$ (commutativity)

3. $(L$ tie $b_1)$ sy $A_l = (L$ sy $A_l)$ tie $b_1$ (commutativity with synchronization)

Proof. 1. By definition 6, operation tie makes desired links and removes con- 
cerned tie symbols from the transitions labels. A second application of tie 
over the same tie symbol does not change anything in the net. 

2. By definition 6, tie operations for different tie symbols are totally
independent, so the order of application does not matter.




Asynchronous Links in the PBC and M-Nets 197 



3. By definition 6 and 4, when tie is applied first, it creates arcs which are 
inherited by the new transitions created by sy. Conversely, if sy is applied 
first, it transmits links to the new transitions, allowing tie to create the 
expected arcs. 

Since operation tie is commutative, it naturally extends to a set of tie sym- 
bols. 
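
The low-level tie operation and the first two properties of Theorem 1 can be explored with a small sketch. Everything about the data layout is an assumption made for illustration: a net is a dictionary with `places`, `transitions` and `weights`, a link is a triple `(symbol, direction, value)`, and the place created for tie symbol `b` and value `v` is named `("s", b, v)`, so that applying tie twice refers to the same places.

```python
import copy
from collections import Counter


def tie(net, b, type_b):
    """Sketch of L tie b: add one internal place per value of type(b),
    turn links over b into arcs, and delete those links from the labels."""
    new = copy.deepcopy(net)
    for v in type_b:
        new["places"][("s", b, v)] = "i"
    for t, (alpha, beta) in net["transitions"].items():
        kept = Counter()
        for (sym, d, v), n in beta.items():
            if sym != b:
                kept[(sym, d, v)] = n                    # link over another symbol
            elif d == "+":
                new["weights"][(t, ("s", b, v))] = n     # export: arc t -> s_{b,v}
            else:
                new["weights"][(("s", b, v), t)] = n     # import: arc s_{b,v} -> t
        new["transitions"][t] = (alpha, kept)
    return new


net = {
    "places": {"p": "e"},
    "transitions": {
        "t1": (("A",), Counter({("b", "+", 2): 1, ("c", "-", 1): 1})),
        "t2": ((), Counter({("b", "-", 2): 1})),
    },
    "weights": {("p", "t1"): 1},
}

once = tie(net, "b", {1, 2})
twice = tie(once, "b", {1, 2})
bc = tie(tie(net, "b", {1, 2}), "c", {1})
cb = tie(tie(net, "c", {1}), "b", {1, 2})
```

On this example `once == twice` (idempotence) and `bc == cb` (commutativity); this is only a check on one net, not a proof of the theorem.
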

Theorem 2. Let $N$ be an M-net, and $b \in B$. Then:

$U(N$ tie $b) = U(N)$ tie $b$.

Proof. It is enough to remark that the high-level tie operation creates a place
$s_b$ with type $type(b)$ and adds arcs to/from transitions which carry links on
$b$ in their inscriptions. The unfolding gives for this new place a set of places
$\{s_{b,v} \mid v \in type(b)\}$ and the set of arcs to/from transitions $t_\sigma$. The weight of an
arc between place $s_{b,v}$ and transition $t_\sigma$ corresponds to the multiplicity of value
$v$ in the evaluation through $\sigma$ of the high-level arc inscription between $s_b$ and $t$.
By definition, this is exactly what is done by the low-level tie operation.

Now, corollary 1 follows directly from the two theorems above and from the
commutativity of synchronization with unfolding [6] (notice that this
commutativity is clearly preserved after the introduction of links).

Corollary 1. Let $N$ be an M-net, $A_h \in A_h$ and $b_1, b_2 \in B$. Then:

1. $(N$ tie $b_1)$ tie $b_1 \equiv N$ tie $b_1$ (idempotence)

2. $(N$ tie $b_1)$ tie $b_2 \equiv (N$ tie $b_2)$ tie $b_1$ (commutativity)

3. $(N$ tie $b_1)$ sy $A_h \equiv (N$ sy $A_h)$ tie $b_1$ (commutativity with synchronization)

where $\equiv$ identifies M-nets which are equivalent modulo renaming of variables [6].

7 An Application to Discrete Time Modeling 

The newly introduced tie operator has an immediate application in the modeling of
discrete time within M-nets. A clock is built on the principles described in [10,15]:
an arbitrary event (i.e., a transition occurrence) is counted and used to build
the time scale to which the whole system refers. Two points can be puzzling: the
time scale is not even, and the clock can be frozen to allow the system to meet its
deadlines. Both these points are discussed in [10] and are shown to be irrelevant
in the context of Petri nets used for specification (as opposed to programming).

Our clock (which is actually a server) is implemented by the M-net $N_{clock}$
depicted in figure 3. It can handle concurrent counting requests for different parts
of the system: each request is assigned an identifier which is allocated by the
clock on a start action; next, this identifier can be used to perform check actions,
which retrieve the current pulse-count for the request; finally, a stop can
delete the request, sending back the pulse-count again.
Fig. 3. The M-net $N_{clock}$: the place $i$ is labeled $\mathsf{i}.Q \times \{\bot, \top\} \times \overline{\mathbb{N}} \times \overline{\mathbb{N}}$.



The clock $N_{clock}$, depicted in figure 3, works as follows:

— it starts when the transition $t_0$ fires and puts tokens in the central place of
$N_{clock}$, each token being a tuple $(q, b, d, c)$ corresponding to a request, where
$q \in Q$ is the identifier for the request, $b \in \{\bot, \top\}$ tells whether the request is idle
($b = \bot$) or in use ($b = \top$), $d$ is the maximum number of pulses to count
for the request and $c$ is the current pulse-count. Both $d$ and $c$ are values in
$\overline{\mathbb{N}} = \mathbb{N} \cup \{\omega\}$, where $\omega$ is greater than any integer and $\omega + 1 = \omega$;

— a request begins when $t_1$ fires: start has two parameters. The request identifier,
$q$, is sent to the client, which provides the duration, $d$, as the maximum
pulse-count for the request;

— next, the current pulse-count can be checked with transition $t_2$: the identifier
is provided by the client, which gets the current pulse-count returned
through $c$;

— $t_3$ is used to stop a request; it acts like $t_2$ but also sets the request's token
to idle;

— the clock can be terminated with transition $t_4$;

— at any time, provided its guard is true, pulse can fire, incrementing the pulse-count
for all requests. This explains the idle value $(q, \bot, 0, \omega)$ for the tokens
in i: it neither changes after a pulse, since $\omega = \omega + 1$, nor affects the guard
on pulse, because $\omega \neq 0$. This guard is used to prevent pulse from firing if
a request is about to end (its pulse-count is equal to the maximum announced
on start); this ensures that a timed sub-net will always meet its deadline if it has
been announced on start.
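
The behaviour described in the list above can be sketched in ordinary code. The following is only an illustrative Python model, not the M-net itself: the class name `Clock`, its method names and the use of `math.inf` to stand for $\omega$ are assumptions made for this sketch, and tokens are stored as (in use, d, c) triples keyed by the identifier q.

```python
import math

OMEGA = math.inf  # stands in for omega: greater than any integer, OMEGA + 1 == OMEGA


class Clock:
    """Illustrative model of the clock server (names are hypothetical)."""

    def __init__(self, identifiers):
        # One idle token (q, idle, 0, omega) per available request identifier.
        self.tokens = {q: (False, 0, OMEGA) for q in identifiers}

    def start(self, d):
        """Allocate an idle identifier for a request lasting at most d pulses."""
        for q, (in_use, _, _) in self.tokens.items():
            if not in_use:
                self.tokens[q] = (True, d, 0)
                return q
        raise RuntimeError("no idle request identifier left")

    def check(self, q):
        """Return the current pulse-count of the running request q."""
        in_use, _, c = self.tokens[q]
        if not in_use:
            raise KeyError(q)
        return c

    def stop(self, q):
        """Return the pulse-count of q and set its token back to idle."""
        c = self.check(q)
        self.tokens[q] = (False, 0, OMEGA)
        return c

    def can_pulse(self):
        # Guard on pulse: no request has reached its announced maximum.
        return all(c != d for (_, d, c) in self.tokens.values())

    def pulse(self):
        """Increment every pulse-count if the guard holds; report success."""
        if not self.can_pulse():
            return False
        self.tokens = {q: (u, d, c + 1) for q, (u, d, c) in self.tokens.items()}
        return True


clock = Clock(["q1", "q2"])
q = clock.start(3)
fired = [clock.pulse() for _ in range(4)]  # the fourth pulse is blocked by the guard
```

Idle tokens never block the guard because their count is `OMEGA` while their maximum is 0, and a request started with maximum `OMEGA` never blocks it either, which mirrors a start without an announced deadline.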

In order to use this clock, one just has to add “clock actions” (start, check and
stop) on the transitions intended to be timed, and asynchronous links should be







Asynchronous Links in the PBC and M-Nets 199 



[Figure: three transitions in sequence, carrying the actions {start(q, 3)}, then {start(q', ω), check(q, c), write(c)}, then {stop(q, c), stop(q', c')} with guard {c' = 1}; the request identifiers q and q' travel through asynchronous links.]

Fig. 4. An example of time-constrained M-net.



used to transport the request identifiers between a start and the corresponding
check(s) and stop(s), as in figure 4.

In the net of figure 4, we specify the following constraints:

— the start(q, 3) on $t_5$ corresponds to the stop(q, c) on $t_7$ (thanks to the links
on $b_1$), so there can be at most 3 pulses between $t_5$ and $t_7$ because of the
guard on pulse;

— similarly, there must be exactly 1 pulse between $t_6$ and $t_7$ (here we use the
links on $b_2$ to transmit the identifier), but the constraint is expressed through
the guard on $t_7$. In this case, since the deadline was not announced in the
start, a deadlock is possible if the system runs too slowly, i.e., does
not meet its deadline;

— on $t_6$, we use a check for the request started on $t_5$ (the identifier is imported
and exported on $b_1$) to retrieve the pulse-count between $t_5$ and $t_6$. This count
is sent to another part of the system with the action write(c), which should
be synchronized with a conjugate action $\widehat{write}$ in a piece of the specification not represented
here.

8 Conclusion 

We presented an extension of M-nets and PBC in order to cope with asynchronous
communications at the (net) algebra level. This extension led to a
simple and elegant means to model discrete time within M-nets, so we hope it
will be useful for working with timed specifications. Moreover, we expect
asynchronous links to apply to a wider range of applications. In particular, we
are currently studying: real-time programming with B(PN)^2 [7] (a parallel programming
language with M-nets and PBC semantics), discrete timed automata simulation
and a new implementation of the FIFO channels of B(PN)^2.

Abstract. We propose a demand-driven model-checking algorithm,
which decides the alternation-free modal mu-calculus for context-free
processes. This algorithm enjoys advantages known from local model
checking in that it avoids the investigation of certain irrelevant parts of
a process, and simultaneously improves on its classical counterpart of
[5] in that it avoids the computation of irrelevant portions of property
transformers. In essence, this algorithm evolves from combining the spirit
of second-order model checking underlying the algorithm of [5] with the
idea of demand-drivenness developed in the field of interprocedural data-flow
analysis. Though the new algorithm has the same worst-case time
complexity as its counterpart, we expect a substantial performance gain
in practice because its demand-drivenness reduces the computational effort
of those parts which are responsible for the exponentiality of the
classical second-order algorithm.



1 Motivation 

Model checking (cf. [7, 10, 16]) has proved to be a powerful and practically 
relevant means for the automatic verification of behavioral systems (cf. [8, 15, 
19]). Particularly ambitious are approaches aiming at model checking of infinite 
state systems, which has been pioneered by approaches for local model checking 
(cf. [20, 21]). The point of local model checking is to avoid the investigation of 
parts of a process, which are irrelevant for the verification of the property under 
consideration. Straightforward applications to infinite state systems, however, 
are generally not effective because they cannot guarantee termination (cf. [3, 4]). 

The construction of effective algorithms, however, becomes possible as soon
as one restricts one’s attention to specific classes of infinite state systems, as
e.g. context-free (cf. [5, 12]) or pushdown process systems (cf. [2, 6, 22]). In
this article we focus on context-free process systems (CFPSs) and reconsider the
classical algorithm proposed by Burkart and Steffen for this setting (cf. [5]). Their
algorithm decides the alternation-free modal mu-calculus for context-free process
systems (CFPSs), i.e., for processes which are given in terms of a context-free
grammar. The central idea underlying their algorithm is to raise the standard
model-checking techniques to second order: In contrast to the usual approaches,
in which the set of formulae which are satisfied by a certain state are iteratively
computed, their algorithm iteratively computes a property transformer for each
state class of the finite process representation, which is subsequently used for
solving the original model-checking problem. Later on, this algorithm has been
complemented by Hungar and Steffen by a local model-checking algorithm [12],
which improves on the algorithm of [5] in that irrelevant parts of the process
under consideration need not be investigated.

In this article we develop a demand-driven model-checking algorithm for
the setting considered in [5, 12], which decides the alternation-free modal mu-calculus
for CFPSs. Like the algorithm of [12], the new algorithm aims at improving
the average-time complexity of the algorithm of [5]. It enjoys advantages
known from local model checking in that it avoids the investigation of irrelevant
parts of a process, and simultaneously improves on the original algorithm of [5] in
that it avoids the computation of irrelevant property transformers and irrelevant
portions of property transformers. To achieve this we adopt, as in [5], the view of
CFPSs as mutually recursive systems of finite state-labeled transition systems.
This establishes the link to interprocedural data-flow analysis (cf. [13]), which
provides the key to our approach. In essence, our algorithm evolves from combining
the spirit underlying the classical second-order algorithm of [5] with the
idea of demand-drivenness of interprocedural data-flow analysis (cf. [9, 11, 17]).

It is worth noting that the investigation of irrelevant parts of a process and
the computation of completely irrelevant property transformers could be avoided
in the approach of [5], too, though only at the price of an additional preprocessing
step. Avoiding the computation of irrelevant portions of specific property transformers,
however, is beyond the scope of that approach. This is quite
important in practice because the worst-case time complexity of the algorithm of
[5] is linear in the size of the finite representation of the process under consideration,
but exponential in the size of the formula to be verified, which, in essence,
determines the complexity of computing the property transformers.¹ Though
demand-driven algorithms generally have the same worst-case time complexity
as their standard counterparts, in practice they are usually much more efficient, as
confirmed empirically, e.g., in interprocedural data-flow analysis (IDFA) (cf. [9]).
As in IDFA, we expect that the new, demand-driven algorithm performs significantly
better in practice than its counterpart of [5], since the demand-drivenness
of the new algorithm reduces the computational effort of those parts which are
responsible for the exponentiality of its counterpart. In general, the more complex
the property under consideration, the larger the performance gain will be.

2 Processes and Formulae 

In this section we introduce our setting following the lines of [5, 12]. Central are 
context-free process systems as finite representations of infinite process graphs, 
and the (alternation-free) modal mu-calculus as our logic for specification. 

¹ More precisely, it is exponential in the weight of the formula (cf. [5]). Deciding the
alternation-free modal mu-calculus for CFPSs was recently shown to be EXPTIME-complete
(cf. [14]), while it is polynomial for any fixed formula (cf. [5]).

Definition 1 (Process graph). A process graph (PG) is a quintuple $G = (S, Act, \rightarrow, s_0, s_e)$,
where (1) $S$ is a set of states, (2) $Act$ is a set of actions, (3)
$\rightarrow \subseteq S \times Act \times S$ is the transition relation, and (4) $s_0, s_e \in S$ are distinguished
elements, the “start state” and the “end state.” $s_0$ and $s_e$ must be originating and
terminating, respectively, i.e., there are no $a \in Act$ and $s \in S$ with $(s, a, s_0) \in \rightarrow$
or $(s_e, a, s) \in \rightarrow$. A PG $G$ is said to be finite-state when its sets of states $S$ and
actions $Act$ are finite.

Intuitively, a process graph encodes the operational behavior of a process. The
set $S$ represents the set of states the process can enter, $Act$ the set of actions
the process can perform, and $\rightarrow$ the state transitions which may result upon
execution of the actions. As usual we will write $s \xrightarrow{a} s'$ instead of $(s, a, s') \in \rightarrow$.
Moreover, we will write $s \xrightarrow{a}$ when there is an $s'$ such that $s \xrightarrow{a} s'$.

As in [5], we represent context-free processes, which may have infinitely many 
states, by means of context-free process systems (CFPSs). In essence, a CFPS 
is a set of named finite process graphs, whose set of actions contains the names 
of the system’s process graphs. Transitions labeled with such a name are meant 
to represent the denoted process graph. In this view, the names of the process 
graphs correspond to the non-terminals and the atomic actions to the termi- 
nals of a context-free grammar. Alternatively, transitions labeled by a name 
can be considered a call of the denoted process graph. As in [5] we prefer this 
more dynamic procedural point of view: It establishes the link to interprocedural 
data-flow analysis [18, 13], which is essential for developing our demand-driven 
algorithm for model checking CFPSs. 

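Definition 2 (Procedural process graph). A procedural process graph
(PPG) is a quintuple $P = (\Sigma_P, Trans, \rightarrow_P, \sigma_P^s, \sigma_P^e)$, where (1) $\Sigma_P$ is a set of
state classes, (2) $Trans =_{df} Act \cup \mathcal{N}$ is a set of transformations, where $Act$ is a set
of atomic actions, and $\mathcal{N}$ a set of names with $Act \cap \mathcal{N} = \emptyset$, (3) $\rightarrow_P = \rightarrow_P^{Act} \cup \rightarrow_P^{\mathcal{N}}$
is the transition relation, where $\rightarrow_P^{Act} \subseteq \Sigma_P \times Act \times \Sigma_P$ and $\rightarrow_P^{\mathcal{N}} \subseteq \Sigma_P \times \mathcal{N} \times \Sigma_P$,
and (4) $\sigma_P^s \in \Sigma_P$ and $\sigma_P^e \in \Sigma_P$ are a class of “start states” and “end states.”

A PPG $P$ is called finite, if $\Sigma_P$ and $Trans$ are finite.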

Essentially, a procedural process graph is a process graph, where the set
of actions is divided into two disjoint classes, atomic actions $Act$ and (action)
names $\mathcal{N}$. A PPG $P$ is called guarded, if all initial transitions of $P$ are labeled by
atomic actions. It is called simple, if it does not contain any calls. Following [5]
we denote the set of all simple PPGs by $\mathcal{G}$. In the following we will only consider
guarded PPGs $P$, where the class of end states $\sigma_P^e$ is terminating.

Definition 3 (Context-free process system). A context-free process system
(CFPS) is a sextuple $\mathcal{P} = (\mathcal{N}, Act, \Delta, P_0, s_0, s_e)$, where (1) $\mathcal{N} = \{N_0, \ldots, N_{n-1}\}$
is a set of names, (2) $Act$ is a set of actions, (3) $\Delta =_{df} \{N_i = P_i \mid 0 \le i < n\}$ is
a finite set of PPG-definitions, where the $P_i$ are finite PPGs with names in $\mathcal{N}$
and atomic actions in $Act$, (4) $P_0$ is the “main” PPG, and (5) $s_0$ and $s_e$ are the
start state and the end state of the system.

We denote the union of all state classes of $\mathcal{P}$ by $\Sigma =_{df} \bigcup_{i=0}^{n-1} \Sigma_{P_i}$, and the
union of all transition relations of $\mathcal{P}$ by $\rightarrow =_{df} \bigcup_{i=0}^{n-1} \rightarrow_{P_i}$.

Figure 1 shows a CFPS consisting of three components, which is sufficient to
illustrate the essence of our approach, and to highlight the essential differences
between our algorithm for demand-driven model checking of CFPSs and the
classical algorithm of [5] (cf. Section 4.2). Actually, the CFPS shown is a variant
of the encoding machine used as the running example in [5]. Intuitively, in its first
phase the encoding machine of Figure 1 reads a word over the alphabet {a, b}
until it encounters the word delimiter #. Subsequently, in its second phase it
outputs the corresponding sequence of encoded characters in reverse order. Note
that the end state of $P_0$ is not reachable from $P_2$. This will be discussed in detail
in Section 4.2. It allows us to illustrate an important difference between the
algorithm of [5] and the algorithm here.






Fig. 1. The running example: The encoding machine. 

Definition 4 (Complete expansion). Let $P$ be a PPG of a CFPS $\mathcal{P}$. The
complete expansion of $P$ with respect to $\mathcal{P}$ is the simple PPG, which results from
successively replacing in $P$ each transition $\sigma \xrightarrow{P_i} \sigma'$ by a copy of the corresponding
PPG $P_i$, while identifying $\sigma$ with $\sigma_{P_i}^s$ and $\sigma'$ with $\sigma_{P_i}^e$. We denote the complete
expansion of $P$ with respect to $\mathcal{P}$ by $Exp_{\mathcal{P}}(P)$.

Figure 2 illustrates the flavour of this stepwise expansion process, which
is reminiscent of the copy-rule semantics of imperative languages (cf. [1]). This flavour
is also reflected in the following definition. Given a state $s \in S$ of a
complete expansion $Exp_{\mathcal{P}}(P)$, $s$ is said to belong to the state class $\sigma \in \Sigma_{P_i}$, if
it emerges as a copy of $\sigma$ during the expansion. Thus, a state class stands for a
possibly infinite set of states of the corresponding complete expansion.
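
Since the complete expansion may be infinite, it can only be approximated in code. The following Python sketch unfolds call transitions up to a given nesting depth; the dictionary layout of a CFPS, the function name `expand` and the example system are assumptions made for illustration, and each PPG is assumed to have distinct start and end classes.

```python
from itertools import count


def expand(cfps, main, depth):
    """Unfold the call transitions of `main` up to `depth` nested calls.

    cfps maps a name to {"start": ..., "end": ..., "trans": [(src, label, dst)]};
    a label that is itself a key of cfps is a call, any other label is atomic.
    Returns (start, end, atomic transitions, calls left unexpanded).
    """
    fresh = count()
    transitions, pending = [], []

    def copy(name, entry, exit_, d):
        ppg = cfps[name]
        # Identify the caller's source and target with the callee's start and end.
        local = {ppg["start"]: entry, ppg["end"]: exit_}

        def state(sigma):
            if sigma not in local:
                local[sigma] = next(fresh)  # a fresh copy of the state class
            return local[sigma]

        for src, label, dst in ppg["trans"]:
            s, t = state(src), state(dst)
            if label in cfps:
                if d == 0:
                    pending.append((s, label, t))
                else:
                    copy(label, s, t, d - 1)
            else:
                transitions.append((s, label, t))

    s0, se = next(fresh), next(fresh)
    copy(main, s0, se, depth)
    return s0, se, transitions, pending


# Hypothetical one-component system: P0 = a . P0 . b + c
system = {"P0": {"start": "s", "end": "e",
                 "trans": [("s", "a", "m"), ("m", "P0", "n"),
                           ("n", "b", "e"), ("s", "c", "e")]}}
start, end, trans, pending = expand(system, "P0", 1)
```

Every expanded state is a copy of exactly one state class of the definition it was created from, except for the identified entry and exit states, which is the sense in which a state class stands for a set of states of the expansion.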

If $P = (\Sigma_P, Trans, \rightarrow_P, \sigma_P^s, \sigma_P^e)$ is a PPG and $\sigma$ a state class of $P$, we denote
by $P^{(\sigma)}$ the PPG $(\Sigma_P, Trans, \rightarrow_P, \sigma, \sigma_P^e)$. If $P_1$ and $P_2$ are two PPGs, we define
their sequential composition $P_1; P_2$ as the PPG $P_1; P_2 =_{df} (\Sigma_{P_1} \cup \Sigma_{P_2} \setminus \{\sigma_{P_2}^s\},\;
Trans_{P_1} \cup Trans_{P_2},\; \rightarrow_{P_1} \cup \rightarrow_{P_2}',\; \sigma_{P_1}^s,\; \sigma_{P_2}^e)$, where $\rightarrow_{P_2}'$ denotes the transition
relation resulting from substituting all occurrences of $\sigma_{P_2}^s$ in $\rightarrow_{P_2}$ by $\sigma_{P_1}^e$. Finally,
$end_{PPG}(\sigma)$ denotes the end state of the particular PPG-definition of the underlying
CFPS containing $\sigma$.

Fig. 2. Expanding a procedural transition system.



The Mu-Calculus. As in [12] we consider a negation-free sublanguage of the
modal mu-calculus, which, however, as in [5], is based on an explicit set of atomic
propositions. Its syntax is given by the grammar $\Phi ::= A \mid X \mid \Phi \vee \Phi \mid \Phi \wedge \Phi \mid \langle a \rangle \Phi \mid
[a]\Phi \mid \nu X.\Phi \mid \mu X.\Phi$, where $X$ ranges over a (countable) set of variables $Var$, $A$
over a set of atomic propositions $\mathcal{A}$ and $a$ over a set of actions $Act$. Properties
will later be specified by closed formulae, i.e., formulae which do not contain
any free variable. A formula is called alternation-free, if no $\mu$-subformula has a
free variable which, in the context of the complete formula, is bound by a $\nu$, and
vice versa. We denote the set of all closed alternation-free formulae by $\mathcal{M}$.

As usual the semantics of formulae is defined with respect to a specific (possibly
infinite) process graph $G = (S, Act, \rightarrow, s_0, s_e)$, a valuation $\mathcal{V} : \mathcal{A} \rightarrow 2^S$, and
an environment $e : Var \rightarrow 2^S$. Intuitively, it maps a formula (with free variables)
to the set of states for which the formula is “true.” Correspondingly, a state $s$
satisfies $X$, if $s$ is an element of the set bound to $X$ in environment $e$. Note that
the semantics of a closed formula $\Phi$ is independent of the environment $e$. We will
thus write $s \models \Phi$ instead of $s \in [\![ \Phi ]\!]e$ for all environments $e$.



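\[
\begin{array}{lcl}
[\![ A ]\!]e & = & \mathcal{V}(A) \\
[\![ X ]\!]e & = & e(X) \\
[\![ \Phi_1 \vee \Phi_2 ]\!]e & = & [\![ \Phi_1 ]\!]e \cup [\![ \Phi_2 ]\!]e \\
[\![ \Phi_1 \wedge \Phi_2 ]\!]e & = & [\![ \Phi_1 ]\!]e \cap [\![ \Phi_2 ]\!]e \\
[\![ \langle a \rangle \Phi ]\!]e & = & \{ s \mid \exists s'.\; s \xrightarrow{a} s' \wedge s' \in [\![ \Phi ]\!]e \} \\
[\![ [a] \Phi ]\!]e & = & \{ s \mid \forall s'.\; s \xrightarrow{a} s' \Rightarrow s' \in [\![ \Phi ]\!]e \} \\
[\![ \nu X.\Phi ]\!]e & = & \bigcup \{ S' \subseteq S \mid S' \subseteq [\![ \Phi ]\!]e[X \mapsto S'] \} \\
[\![ \mu X.\Phi ]\!]e & = & \bigcap \{ S' \subseteq S \mid S' \supseteq [\![ \Phi ]\!]e[X \mapsto S'] \}
\end{array}
\]

Table 1. The semantics of the subset of the modal mu-calculus under consideration.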
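
For a finite process graph, the semantics of Table 1 can be computed directly. The Python sketch below is only an illustration under stated assumptions: formulae are encoded as nested tuples (an encoding chosen here, not taken from the paper), and the two fixed points are obtained by iterating from the full and the empty state set, which on a finite graph and for negation-free formulae yields the greatest and least fixed points.

```python
def sem(phi, states, trans, val, env):
    """Set of states satisfying phi on a finite graph (tuple-encoded formulae)."""
    kind = phi[0]
    if kind == "prop":
        return set(val[phi[1]])
    if kind == "var":
        return set(env[phi[1]])
    if kind == "or":
        return sem(phi[1], states, trans, val, env) | sem(phi[2], states, trans, val, env)
    if kind == "and":
        return sem(phi[1], states, trans, val, env) & sem(phi[2], states, trans, val, env)
    if kind in ("dia", "box"):
        a, target = phi[1], sem(phi[2], states, trans, val, env)
        if kind == "dia":  # some a-successor satisfies the body
            return {s for (s, b, t) in trans if b == a and t in target}
        # box: every a-successor satisfies the body
        return {s for s in states
                if all(t in target for (u, b, t) in trans if u == s and b == a)}
    if kind in ("nu", "mu"):
        x, body = phi[1], phi[2]
        current = set(states) if kind == "nu" else set()
        while True:  # iterate until the approximation is stable
            nxt = sem(body, states, trans, val, {**env, x: current})
            if nxt == current:
                return current
            current = nxt
    raise ValueError(f"unknown formula kind: {kind}")


states = {0, 1, 2}
trans = {(0, "a", 1), (1, "a", 2)}
terminates = sem(("mu", "X", ("box", "a", ("var", "X"))), states, trans, {}, {})
diverges = sem(("nu", "X", ("dia", "a", ("var", "X"))), states, trans, {}, {})
```

On this three-state chain every a-path is finite, so the least fixed point of [a]X holds everywhere, while the greatest fixed point of ⟨a⟩X, which asks for an infinite a-path, holds nowhere.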



Second-Order Semantics. According to Table 1 the validity of a formula is defined
with respect to single states. An analogous definition for a state class is
in general not meaningful because the truth value of a formula is in general
different for different representatives of a state class. However, two states $s$ and
$s'$ of the same state class satisfy the same set of formulae if this holds for their
corresponding end states $end(s)$ and $end(s')$. As pointed out in [12], this is the
key to the second-order semantics defined in [5], which can be considered the
analogue of the second-order functional approach of interprocedural data-flow
analysis (cf. [18, 13]). It considers the semantics of a (named) PPG as a property
transformer, i.e., as a function which yields the set of formulae being valid at
the start state relative to the assumption that the set of formulae it is applied
to are valid at the end state.

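Definition 5 (Second-order semantics). Let $P = (\Sigma_P, Trans, \rightarrow_P, \sigma_P^s, \sigma_P^e)$
be a simple PPG. Then we interpret $P$ as the function $[\![ P ]\!] : 2^{\mathcal{M}} \rightarrow 2^{\mathcal{M}}$ with
$[\![ P ]\!](M) =_{df} \{ \Phi' \in \mathcal{M} \mid \forall P' \in \mathcal{G}.\ (P' \cap P = \emptyset).\ \sigma_{P'}^s \models_{P'} M \Rightarrow \sigma_P^s \models_{P;P'} \Phi' \}$.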

Theorem 1 guarantees the consistency of the second-order semantics with respect 
to the usual semantics of simple PPGs in terms of its valid formulae (cf. [12]). 

Theorem 1 (Consistency of the second-order semantics). Let $P$ be a
simple PPG and $\Phi$ be a closed formula, and let $\mathcal{M}_{deadlock}$ denote the set of all
propositions which are “true” at a “deadlock state,” i.e., $\mathcal{M}_{deadlock} =_{df} \{ \Phi \mid s \models_T
\Phi$ with $T = (\{s\}, Act, \emptyset) \}$. Then we have: $\sigma_P^s \models \Phi \iff \Phi \in [\![ P ]\!](\mathcal{M}_{deadlock})$.

3 Hierarchical Equational Systems 

As in [5] we present our algorithm for demand-driven model checking of CFPSs
in a version where logic formulae are represented by hierarchical equational systems.
These are equally expressive as the alternation-free modal mu-calculus $L\mu_1$
(cf. Theorem 3 ([10])), but simplify the technical presentation of the algorithm
and its comparison with the classical algorithm of [5]. Therefore, we briefly recall
the syntax and semantics of hierarchical equational systems in this section.

Syntax. Fundamental for the definition of hierarchical equational systems is
the notion of a basic formula. Their syntax is given by the grammar $\Phi ::=
A \mid X \mid \Phi \vee \Phi \mid \Phi \wedge \Phi \mid \langle a \rangle \Phi \mid [a]\Phi$. Because of the absence of fixpoint operators,
they are less expressive than the formulae defined in Section 2. This, however,
is overcome by introducing (mutually recursive) equational systems.

Definition 6 ((Equational) blocks and systems). An (equational) block $B$
has one of the two forms, $min\{E\}$ or $max\{E\}$, where $E$ is a list of (mutually
recursive) equations $(X_1 = \Phi_1, \ldots, X_n = \Phi_n)$, in which each $\Phi_i$ is a basic formula
and the $X_i$ are pairwise distinct, and where $min$ and $max$ indicate the fixed point
of $E$ desired, i.e., the least and greatest fixed point, respectively. An equational
system $\mathcal{B} = (B_1, \ldots, B_m)$ is a list of equational blocks, where the variables appearing
on the left-hand sides of the blocks are all pairwise distinct.

Intuitively, a block defines n (mutually recursive) propositions, i.e., one per 
variable (see Section 4.2 for an example). Several blocks may be used to express 




Demand-Driven Model Checking for Context-Free Processes 



207 



complex formulae. In particular, the left-hand side of an equation in one block 
may be referred to in the right-hand sides of equations in other blocks. Since 
we are focusing on the alternation-free subset of the mu-calculus, we restrict 
our attention to hierarchical equational systems. The restrictions they obey (see 
Figure 3 for illustration) guarantee the desired absence of alternating fixed points. 
In terms of Section 2, the formulae they represent are alternation-free (cf. [10]).

Definition 7 (Hierarchical equational system). An equational system $\mathcal{B} =
(B_1, \ldots, B_m)$ is hierarchical, if the existence of a left-hand-side variable of a
block $B_j$ appearing in a right-hand-side formula of a block $B_i$ implies $i \le j$.
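
The condition of Definition 7 is easy to check mechanically. The Python sketch below assumes the condition is read as "i is at most j", so that equations within one block may refer to each other, and it uses a simplified representation chosen for illustration: each block is a pair of its kind and a mapping from left-hand-side variables to the set of variables occurring on the corresponding right-hand side.

```python
def is_hierarchical(system):
    """Check that block i only refers to variables defined in blocks j >= i.

    system is a list of (kind, equations) pairs, where equations maps each
    left-hand-side variable to the set of variables on its right-hand side.
    """
    # Index of the block that defines each left-hand-side variable.
    owner = {x: i for i, (_, eqs) in enumerate(system) for x in eqs}
    return all(
        owner[y] >= i
        for i, (_, eqs) in enumerate(system)
        for rhs in eqs.values()
        for y in rhs
        if y in owner
    )


# The first block may use Y from the second block, but not the other way round.
good = [("max", {"X": {"X", "Y"}}), ("min", {"Y": {"Y"}})]
bad = [("max", {"X": {"Y"}}), ("min", {"Y": {"X"}})]
```

In `bad` the min-block depends on the max-block and vice versa, which is exactly the alternation of fixed points that the restriction rules out.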



Semantics. The semantics of a hierarchical equational system $\mathcal{B}$ is defined on
top of the semantics of individual blocks $B$. To this end, let $E$ be the list of
equations $(X_1 = \Phi_1, \ldots, X_n = \Phi_n)$. For every environment $\rho$, we can now build
a function $f_E^\rho : (2^S)^n \rightarrow (2^S)^n$ as follows. Let $\bar{S} = (S_1, \ldots, S_n) \in (2^S)^n$, and let
$\rho_{\bar{S}} = \rho[X_1 \mapsto S_1, \ldots, X_n \mapsto S_n]$ be the environment, which results from $\rho$ by
updating the binding of $X_i$ to $S_i$. Then: $f_E^\rho(\bar{S}) =_{df} ([\![ \Phi_1 ]\!]\rho_{\bar{S}}, \ldots, [\![ \Phi_n ]\!]\rho_{\bar{S}})$.

The domain $(2^S)^n$ forms a complete lattice, where the ordering, join, and
meet operations are the pointwise extensions of the set-theoretic inclusion $\subseteq$,
union $\cup$, and intersection $\cap$, respectively. Moreover, for any list of equations
$E$ and environment $\rho$, the function $f_E^\rho$ is monotonic with respect to the order
relation of this lattice. According to Tarski’s well-known fixed-point theorem,
it has thus a greatest fixed point, $\nu f_E^\rho$, and a least fixed point, $\mu f_E^\rho$, satisfying
$\nu f_E^\rho = \bigcup \{ \bar{S} \mid \bar{S} \subseteq f_E^\rho(\bar{S}) \}$ and $\mu f_E^\rho = \bigcap \{ \bar{S} \mid \bar{S} \supseteq f_E^\rho(\bar{S}) \}$ (cf. [5]).

Blocks $max\{E\}$ and $min\{E\}$ are interpreted as environments: $[\![ max\{E\} ]\!]\rho =
\rho_{\nu f_E^\rho}$ and $[\![ min\{E\} ]\!]\rho = \rho_{\mu f_E^\rho}$. This means that $max\{E\}$ and $min\{E\}$ represent
the “greatest” and the “least” fixed point of $E$. The relative semantics of a hierarchical
equational system $\mathcal{B} = (B_1, \ldots, B_m)$ with respect to $\rho$ is then defined in
terms of a sequence of environments: $\rho_m = [\![ B_m ]\!]\rho$, $\rho_{m-1} = [\![ B_{m-1} ]\!]\rho_m$, $\ldots$,
$\rho_1 = [\![ B_1 ]\!]\rho_2$. The semantics of $\mathcal{B}$, finally, is defined by (cf. [5]): $[\![ \mathcal{B} ]\!]\rho =_{df} \rho_1$.



The notion of closed formulae can easily be extended to hierarchical equational
systems $\mathcal{B}$. A basic proposition $\Phi$ is closed with respect to $\mathcal{B}$, if every variable
in $\Phi$ appears on the left-hand side of some equation of some block of $\mathcal{B}$. $\mathcal{B}$ is
closed, if each right-hand side of each block of $\mathcal{B}$ is closed with respect to $\mathcal{B}$.
Theorem 2 yields that the consideration of variables is even sufficient (cf. [5]).
Theorem 3, subsequently, establishes the desired link between the alternation-free
modal mu-calculus recalled in Section 2 and hierarchical equational systems (cf. [10]).

[Figure: the blocks $B_1, B_2, \ldots, B_m$ of a hierarchical equational system $\mathcal{B} = \langle B_1, \ldots, B_m \rangle$. The arrows indicate the dependencies of blocks: the left-hand-side variable at the origin of an arrow may occur in the right-hand side of its destination.]

Fig. 3. Hierarchical equational systems.

Theorem 2. Let $\mathcal{B}$ be a closed hierarchical equational system, and let $\Phi$ be a
(basic) proposition which is closed with respect to $\mathcal{B}$. Then we have: (1) $[\![ \Phi ]\!][\![ \mathcal{B} ]\!]\rho =
[\![ \Phi ]\!][\![ \mathcal{B} ]\!]\rho'$ for any $\rho$ and $\rho'$. (2) There is a closed hierarchical equational system
$\mathcal{B}'$ having $X'$ as the first variable of its first block such that $[\![ \Phi ]\!][\![ \mathcal{B} ]\!]\rho = [\![ X' ]\!][\![ \mathcal{B}' ]\!]\rho$.

Theorem 3 (Expressiveness). Let $T$ be a labeled transition system, and let $\rho$
be a corresponding environment. Then we have: (1) Every formula $\Phi$ in $L\mu_1$ can
be translated in time linear in the size of $\Phi$ into a hierarchical equational system
$\mathcal{B}$ with $[\![ \Phi ]\!]\rho = [\![ X ]\!][\![ \mathcal{B} ]\!]\rho$ for some left-hand-side variable $X$ of $\mathcal{B}$. (2) For every
hierarchical equational system $\mathcal{B}$ and every variable $X$ there is a formula $\Phi$ in
$L\mu_1$ with $[\![ X ]\!][\![ \mathcal{B} ]\!]\rho = [\![ \Phi ]\!]\rho$.

4 Demand-Driven Model Checking of CFPSs 

In this section we present our algorithm for demand-driven model checking of
CFPSs. In order to simplify a direct comparison with its counterpart of [5], we
present our algorithm for min-blocks. This is sufficient because the treatment
of max-blocks is completely dual, and the hierarchical extension required for
hierarchical equational systems is straightforward in an innermost fashion.

Conventions. In order to simplify the presentation of our algorithm, we assume
as in [5] without loss of generality the following structure for the (min-)
block $B$ under consideration: (1) The right-hand sides of blocks consist of simple
basic formulae, which are characterized by the grammar $\Phi ::= A \mid X \vee X \mid
X \wedge X \mid \langle a \rangle X \mid [a]X$. (2) The graph of the unguarded dependencies on the variables
of $B$, which is defined by having an arc $X_i \rightarrow X_j$ if and only if $X_j$ appears
unguardedly in the right-hand side of the equation for $X_i$, is acyclic. Importantly,
a block can always be transformed into an equivalent block of the same
size satisfying these constraints. The first constraint is established by a step
introducing a new variable for every complex subexpression. The second constraint
is established by a step, whose essence is summarized below, and where it should
be noted that $X_i$ is unguarded in the right-hand-side terms.

\[
\begin{array}{lcl@{\qquad}lcl}
min\{X_i = X_i \vee \Phi\} & \leadsto & min\{X_i = \Phi\} & max\{X_i = X_i \wedge \Phi\} & \leadsto & max\{X_i = \Phi\} \\
min\{X_i = X_i \wedge \Phi\} & \leadsto & min\{X_i = ff\} & max\{X_i = X_i \vee \Phi\} & \leadsto & max\{X_i = tt\}
\end{array}
\]

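Additional Notations. Let $B$ be a closed equational min-block with left-hand-side
variables $\mathcal{X} = \{X_i \mid 1 \le i \le r\}$, where $X_1$ represents the property to be
investigated. Additionally, let (1) $\mathcal{D} =_{df} 2^{\mathcal{X}} \rightarrow 2^{\mathcal{X}}$ be the set of all functions on
$2^{\mathcal{X}}$, let (2) $PT_{\emptyset} \in \mathcal{D}$ be the function which maps every $M \subseteq \mathcal{X}$ to the empty
set, and let (3) $PT_{Id}$ be the identity on $2^{\mathcal{X}}$. Fundamental is then to associate
with each $\sigma \in \Sigma$ of a CFPS $\mathcal{P}$ a function $PT_\sigma \in \mathcal{D}$ and with each transition
$\sigma \xrightarrow{\alpha} \sigma'$ a function $PT_{[\alpha]} \in \mathcal{D}$. The function $PT_{[\alpha]} \in \mathcal{D}$ represents the property
transformer of the process graph denoted by $\alpha$. It is defined according to the following
two cases:

- Case 1: $\alpha = P_j \in \mathcal{N}$: $PT_{[P_j]} =_{df} PT_{\sigma_{P_j}^s}$

- Case 2: $\alpha = a \in Act$: Let $M \subseteq \mathcal{X}$ and $X_j = \Phi_j$ be some equation of $B$. Then

\[
X_j \in PT_{[a]}(M) \iff_{df}
\left\{
\begin{array}{l}
\Phi_j = \langle a \rangle X_i \text{ and } X_i \in M \\
\Phi_j = [a] X_i \text{ and } X_i \in M \\
\Phi_j = [b] X_i \text{ and } b \neq a
\end{array}
\right.
\]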



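If $PT_i$, $i \in \{1, \ldots, k\}$, is a property transformer in $\mathcal{D}$ and $\sigma \in \Sigma$, then the
function $\bigcirc^{\sigma}_{1 \le i \le k} PT_i$ is defined as follows. Given $M \subseteq \mathcal{X}$ and the equation
$X_j = \Phi_j$ of $B$, $M' =_{df} (\bigcirc^{\sigma}_{1 \le i \le k} PT_i)(M)$ is defined by

\[
X_j \in M' \iff_{df}
\left\{
\begin{array}{l}
\Phi_j = A \text{ and } \sigma \in \mathcal{V}(A) \\
\Phi_j = X_{j_1} \wedge X_{j_2} \text{ and } X_{j_1} \in M' \text{ and } X_{j_2} \in M' \\
\Phi_j = X_{j_1} \vee X_{j_2} \text{ and } (X_{j_1} \in M' \text{ or } X_{j_2} \in M') \\
\Phi_j = \langle a \rangle X' \text{ and there is an } i \in \{1, \ldots, k\} \text{ with } X_j \in PT_i(M) \\
\Phi_j = [a] X' \text{ and } X_j \in PT_i(M) \text{ holds for all } i \in \{1, \ldots, k\}
\end{array}
\right.
\]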



The acyclicity of the graph of unguarded dependencies guarantees that this 
definition is well-founded. This allows us to present our algorithm now in detail. 
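
As a small illustration of Case 2 above, the property transformer of an atomic action can be written as a function on sets of variables. This Python sketch assumes a block in the simple form of the conventions, encoded as a dictionary from variables to tuples; the encoding and the name `pt_atomic` are chosen here and are not taken from the paper.

```python
def pt_atomic(a, block, M):
    """Variables X_j that hold before an a-step if the variables in M hold after it.

    block maps each variable X_j to a simple basic formula, e.g.
    ("dia", "a", "X1") for <a>X1 or ("box", "b", "X1") for [b]X1.
    """
    result = set()
    for x_j, phi in block.items():
        if phi[0] == "dia" and phi[1] == a and phi[2] in M:
            result.add(x_j)  # <a>X_i with X_i in M
        elif phi[0] == "box" and (phi[1] != a or phi[2] in M):
            result.add(x_j)  # [a]X_i with X_i in M, or [b]X_i with b != a
    return result


block = {
    "X2": ("box", "a", "X1"),
    "X4": ("box", "b", "X1"),
    "X6": ("box", "#", "X1"),
    "X7": ("dia", "a", "X1"),
}
with_x1 = pt_atomic("a", block, {"X1"})
without_x1 = pt_atomic("a", block, set())
```

A box formula for a different action holds trivially before a single a-step, which is why `X4` and `X6` appear in both results.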



4.1 The Algorithm 

The classical algorithm of [5] has a three-step structure. The first step computes
for every state class of the finite process representation the complete property
transformer representing its second-order semantics. The second step computes
the set of subformulae of the property $\Phi$ to be checked, which are satisfied
by a deadlock state, i.e., $\mathcal{M}_{deadlock}$. The third step, finally, applies the property
transformer of the start state of the main PPG $P_0$ to the argument computed in step
two in order to decide according to Theorem 1 the model-checking problem.

Our demand-driven algorithm enjoys a similar structure. However, the computation
of property transformers is only triggered for arguments which are really
needed in order to decide the model-checking problem. Our algorithm thus
starts with computing the set of formulae which are satisfied by a deadlock state,
$\mathcal{M}_{deadlock}$, which belongs to the epilogue of the original algorithm. The key
component of our algorithm is the procedure Process. It is the analogue of the
procedure update of the algorithm of [5]. In contrast to update, whose domain are
predicate transformers, and which thus updates a predicate transformer always
as a whole, i.e., for all arguments of its domain, Process updates a predicate
transformer pointwise, i.e., for a single argument at a time. This is essential for
the demand-drivenness of our approach. The procedure DynamicInitialization,
finally, is the analogue of the initialization procedure of the algorithm of [5].
Again, the essential difference is that DynamicInitialization works pointwise,
i.e., demand-drivenly. Whenever the computation of the predicate transformers
of a process graph is triggered for some argument, they are initialized for this
particular argument only, instead of for their complete domain.
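
The difference between pointwise and complete initialization can be made concrete with a small count. The Python sketch below is only an illustration: it assumes the transformers are stored in a table keyed by (state class, argument) and that initialization sets the entry of the end class to the argument itself and all other entries to the empty set; the names `initialize` and `powerset` and the example sizes are chosen here.

```python
from itertools import combinations


def powerset(xs):
    """All subsets of xs as frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]


def initialize(table, state_classes, end, M):
    """Create the table entries of one process graph for the single argument M."""
    M = frozenset(M)
    for sigma in state_classes:
        # The end class starts with M itself, every other class with the empty set.
        table[(sigma, M)] = M if sigma == end else frozenset()


classes = ["s", "m", "e"]            # hypothetical state classes, "e" is the end class
variables = ["X1", "X2", "X3", "X4"]  # hypothetical block variables

lazy = {}
initialize(lazy, classes, "e", {"X4"})  # triggered for one argument only

eager = {}
for M in powerset(variables):           # complete domain, as in a non-demand-driven setup
    initialize(eager, classes, "e", M)
```

With four variables the complete table has 3 · 2⁴ = 48 entries, while a single triggered argument creates only 3, and the gap grows exponentially with the number of variables.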

Next the demand-driven algorithm is given for the basic case of a closed
formula $\Phi \in \mathcal{M}$ of the alternation-free modal $\mu$-calculus. We recall that $\Phi$ is
assumed to be equivalently given in terms of a (hierarchical) equational system
$\mathcal{B} = (B_1)$, which is assumed to be a min-block, where $X_1$ denotes the left-hand-side
variable of the first equation of $B_1$. As its counterpart of [5] this
algorithm can straightforwardly be hierarchically extended. After termination
of the (complete) algorithm the value stored in answerToModelCheckingProblem
specifies the truth value of $\Phi$ with respect to $P_0$, i.e., whether $\sigma_{P_0}^s \in [\![ \Phi ]\!]$.

(Prologue)      M_deadlock := $\mathcal{M}_{deadlock}$;

(Main process)  alreadyTriggered := ∅; workset := ∅;
                DynamicInitialization(P_0, M_deadlock);
                WHILE workset ≠ ∅ DO
                  CHOOSE (σ, M) ∈ workset;
                    workset := workset \ { (σ, M) };
                    Process((σ, M))
                  ESOOHC OD;

(Epilogue)      answerToModelCheckingProblem := X_1 ∈ PT_{σ^s_{P_0}}(M_deadlock).

PROCEDURE DynamicInitialization(P, M);
  PT_{σ^e_P}(M) := M;
  FORALL σ ∈ Σ_P \ { σ^e_P } DO PT_σ(M) := ∅ OD;
  alreadyTriggered := alreadyTriggered ∪ { (σ^s_P, M) };
  workset := workset ∪ { (σ^e_P, M) } END DynamicInitialization;

PROCEDURE Process((σ, M));
  FORALL σ' -α-> σ ∈ { σ'' -α-> σ | σ'' ∈ Σ ∧ α ∈ Trans } DO
    IF α ∈ N THEN
      IF (σ^s_α, PT_σ(M)) ∉ alreadyTriggered THEN
        DynamicInitialization(α, PT_σ(M)) FI FI;
    tmp := PT_{σ'}(M); PT_{σ'}(M) := (○^{σ'} { PT_[α] ∘ PT_{σ''} | σ' -α-> σ'' })(M);
    IF PT_{σ'}(M) ≠ tmp THEN
      IF σ' = σ^s_{P_i} for some i ∈ I THEN
        workset := workset ∪ { (σ'', M'') | σ''' -P_i-> σ'' ∧ σ'' ∈ Σ ∧
                                (…) ∈ alreadyTriggered ∧ PT_{σ''}(M'') = M }
      ELSE workset := workset ∪ { (σ', M) } FI FI END Process;



4.2 Discussion 

In this section we compare our demand-driven model-checking algorithm with its
conventional counterpart and illustrate the benefits it offers by means of the running
example and the property considered for illustration in [5]. This property
is as follows: The encoding machine of Figure 1 reads the full (finite) sequence of
input symbols before it outputs the first code. Restricting the attention to finite
inputs which are terminated by the character #, this can be rephrased as follows:
After outputting an encoded character a' or b' the machine will always refuse to
read #. Below, this property is expressed in terms of a max-block, where $X_1$
corresponds to $\Phi$.

In this example the new demand-driven algorithm improves in two respects
on the algorithm of [5]: (1) The property transformers corresponding to the states
of PPG $P_2$ are not computed at all. (2) The property transformers corresponding
to the states of the PPGs $P_0$ and $P_1$ are only computed as far as it is necessary to
decide the model-checking problem. Considering $P_0$, this is particularly evident:
The computation of the property transformers of the states of $P_0$ is only triggered
for a single argument, the set of formulae satisfied by a deadlock state, instead
of for all elements of the power set of subformulae of the complete property.

In practice, this can be the source of substantial performance gains. 



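\[
max \left\{
\begin{array}{llll}
X_1 = X_2 \wedge X_3 & X_6 = [\#]X_1 & Y_1 = Y_2 \wedge Y_3 & Y_6 = [\#]Y_7 \\
X_2 = [a]X_1 & X_7 = X_8 \wedge X_9 & Y_2 = [a]Y_1 & Y_7 = ff \\
X_3 = X_4 \wedge X_5 & X_8 = [a']Y_1 & Y_3 = Y_4 \wedge Y_5 & Y_8 = Y_9 \wedge Y_{10} \\
X_4 = [b]X_1 & X_9 = [b']Y_1 & Y_4 = [b]Y_1 & Y_9 = [a']Y_1 \\
X_5 = X_6 \wedge X_7 & & Y_5 = Y_6 \wedge Y_8 & Y_{10} = [b']Y_1
\end{array}
\right\}
\]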



5 Conclusions 

Based on the algorithm of [5], we developed a demand-driven model-checking
algorithm deciding the alternation-free modal mu-calculus for context-free pro- 
cesses. The key to this approach was to adopt the idea of demand-drivenness 
developed in interprocedural data-flow analysis to model checking. Though the 
details are different, the resulting algorithm combines advantages known from 
local model checking, i.e., avoiding the investigation of certain irrelevant portions 
of a process, with those of demand-driven data-flow analysis, i.e., avoiding the 
computation of irrelevant (portions of) property transformers. Besides adding a 
new instrument to the orchestra of model-checking approaches, we expect that 
the new algorithm behaves significantly better in practice than its classical coun- 
terpart because the demand-drivenness reduces the computational effort of those 
parts, which are responsible for the exponentiality of the classical algorithm. An 
empirical confirmation of this fact would comply with practical results obtained 
for demand-driven data-flow analysis. Additionally, we are investigating how to 
adapt and generalize the idea of demand-drivenness to model checking of
pushdown process systems in the fashion of [6].
Abstract. We present an algebra for programming the itineraries of mobile
agents. The algebra contains operators for modelling sequential, parallel,
nondeterministic, and conditional mobility behaviour. Iterative behaviour is also
modelled by a language of regular itineraries borrowing ideas from regular
expressions. We give an operational semantics for the operators using Plotkin-style
transition rules and provide examples of itineraries for meeting scheduling, sales
order processing, and network modelling.



1 Introduction 

Mobile agents can be regarded as software components which can move from one host 
to another to perform computations. Mobile agents have many benefits for distributed 
computing such as reducing network load by moving computation to data and asyn- 
chronous computation [5]. With the growing popularity of the mobile agent paradigm 
(e.g., [12, 10]) and the increasing ubiquity of networked computers, we envision future 
mobile agent applications involving complex movements of agents over large scale en- 
vironments with hundreds or thousands of nodes within and between enterprises and 
homes. Programming the mobility behaviour of these agents would be a challenge. 

Recent work such as Java-based Moderator Templates (JMTs) [7] which enable 
agent itineraries or plans to be composed and coordination patterns [13] provide ab- 
stractions for the mobility behaviour of agents. This paper aims to take such work fur- 
ther by providing an algebra of itineraries for reasoning with and programming the 
mobility behaviour of mobile agents. The goals of this language are fourfold: 

- Encourage separation of concerns to simplify mobile agent programming. The
mobility aspect of a group of agents is abstracted away from the code details
implementing the computations the agents are to perform on hosts. Herein lies a
similarity between our approach to mobile agent programming and the emerging
paradigm of aspect-oriented programming (AOP) [4].

- Provide top-level structuring of a mobile agent application. 

- Provide economy of expression of mobility behaviour. The programmer expresses 
behaviour such as “move agent A to place p and perform action a” in a simple direct 
succinct manner without the clutter of the syntax of a full programming language. 

- Facilitate identification and reuse of patterns in mobility behaviour.

The rest of this paper is organized as follows. In §2, we describe the mobile agent
model on which our work is based. In §3, we describe the syntax and operational
semantics of agent itineraries, and discuss two example itineraries. In §4, we discuss
allowing the destination of agents to be dynamically determined by operations, and
give an itinerary for network modelling as an example. In §5, we present related work
and conclude in §6.

2 Mobile Agent Model 

A mobile agent architecture consists of agents and places. A place receives agents and 
is an environment where an agent executes. A place has an address which we call a 
place address. Typically, as in many agent libraries, a place is a server which can be 
addressed using a hostname and a port number (e.g., www.dstc.edu.au:888).

We use an object-oriented model of agents. We assume that an agent is an instance 
of a class, and roughly, we define a mobile agent as follows: 

mobile agent = state + action + mobility 

State refers to an agent’s state (values of instance variables) possibly including a re- 
flection of the agent’s context. Action refers to operations the agent performs to change 
its state or that of its context. Mobility comprises all operations modifying an agent’s 
location, including moving its state and code to other than the current location. While 
mobility assumes that an agent moves at the agent’s own volition, the itineraries may 
be viewed as a specification or plan of agent movements. A community of agents is 
considered to implement the movement plan correctly, if their observed behaviour cor- 
responds to (i.e., “simulates”) the plan. This links our calculus with other more general 
calculi for concurrent objects, such as the pi-calculus. 

We assume that the agents have the capability of cloning, that is, creating copies of 
themselves with the same state and code. We also assume that agents can communicate 
to synchronize their movements, and an agent’s code is runnable in each place it visits. 

3 An Algebra of Itineraries 

Let $\mathbf{A}^{\{0,1\}^*}$ (where $\{0,1\}^*$ denotes strings of zero or more zeroes
or ones), $\mathbf{O}$ and $\mathbf{P}$ be finite sets of agent, action and place
symbols, respectively. In expressions we use letters
$A, B, C, \ldots \in \mathbf{A}^{\{0,1\}^*}$, $a, b, c, \ldots \in \mathbf{O}$ and
$p, q, r, \ldots \in \mathbf{P}$. Also, we use the following convention for generating
names for agents and their clones. When an agent $A$ is cloned, agent $A$ is renamed
$A^0$ and its clone is named $A^1$. Similarly, when $A^0$ is cloned, we have $(A^0)^0$
(or $A^{00}$) and $(A^0)^1$ (or $A^{01}$), and for $A^1$, we have $A^{10}$ and $A^{11}$,
and so on. More generally, to any $A \in \mathbf{A}^{\{0,1\}^*}$ we add superscripts 0
and 1 to form $A^0$

and A^. Note that with this convention, the name of the original agent is in For 

example, and denote original agents. 

Itineraries (denoted by $I$) are formed as follows, representing the null activity,
atomic activity, and parallel, sequential, nondeterministic and conditional
nondeterministic behaviour. They have the following syntax:

\[ I ::= 0 \;\mid\; A^a_p \;\mid\; A^a_l \;\mid\; (I \parallel I) \;\mid\; (I \cdot_\oplus I) \;\mid\; (I \mid I) \;\mid\; (I :_\Pi I) \]

where $A \in \mathbf{A}^{\{0,1\}^*}$, $a \in \mathbf{O}$, $p \in \mathbf{P}$,
$PO \subseteq (\mathbf{P} \to \mathbf{P})$ is a set of placement operations
described in §4, $l \in L_{PO}$ which is the language of regular itineraries described
in §4, $\oplus$ is an operator which combines an agent with its clone to form a new
agent, and $\Pi$ is an operator which returns a boolean value to model conditional
behaviour. Whenever $\oplus$ and $\Pi$ are not part of the current discussion, for
simplicity, we write "$\cdot$" and "$:$". Below, we give an operational semantics to
the above operators.



3.1 Operational Semantics 

We first define $ags(I)$ to mean the set of agents (symbols) occurring in the
itinerary $I$. $ags(I)$ is defined inductively as the minimal subset of
$\mathbf{A}^{\{0,1\}^*}$ satisfying: $ags(0) = \emptyset$, $ags(A^a_p) = \{A\}$,
$ags(A^a_l) = \{A\}$, $ags(I \cdot J) = ags(I) \cup ags(J)$,
$ags(I \parallel J) = ags(I) \cup ags(J)$, $ags(I \mid J) = ags(I) \cup ags(J)$, and
$ags(I : J) = ags(I) \cup ags(J)$.

We also define a configuration, which is a relation involving agents, actions and
places, denoted by $\Sigma$ (and by $\Sigma', \Sigma'', \ldots$), where
$\Sigma \subseteq \mathbf{A}^{\{0,1\}^*} \times \mathbf{O} \times \mathbf{P}$. Given a
collection of agents, the configuration represents the location and action of each
agent. A constraint on a configuration is that it defines a function from
$\mathbf{A}^{\{0,1\}^*}$ to $\mathbf{P}$: the same agent cannot
be in two places in the same configuration.
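
The inductive definition of $ags$ can be sketched in a few lines of Python. The class names `Null`, `Move` and `Op`, and the operator tags `"par"`, `"seq"`, `"alt"` and `"cond"`, are illustrative assumptions and not notation from the paper; atomic itineraries over regular itineraries are not modelled separately here.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Null:
    """The null itinerary 0."""


@dataclass(frozen=True)
class Move:
    """Atomic itinerary: move `agent` to `place` and perform `action` there."""
    agent: str
    action: str
    place: str


@dataclass(frozen=True)
class Op:
    """Composite itinerary; `kind` is "par", "seq", "alt" or "cond" (assumed tags)."""
    kind: str
    left: "Itinerary"
    right: "Itinerary"


Itinerary = Union[Null, Move, Op]


def ags(itinerary):
    """Set of agent symbols occurring in an itinerary, following the inductive cases."""
    if isinstance(itinerary, Null):
        return frozenset()
    if isinstance(itinerary, Move):
        return frozenset({itinerary.agent})
    # All four binary operators take the union of their operands' agents.
    return ags(itinerary.left) | ags(itinerary.right)


# (A^a_p || (B^b_q . A^c_r)) mentions the agents A and B.
example = Op("par", Move("A", "a", "p"), Op("seq", Move("B", "b", "q"), Move("A", "c", "r")))
agents = ags(example)
```

Every composite case is the same union, so a single fall-through branch suffices for the four operators.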

The operational semantics is given by Plotkin-style rules defining transitions from
one configuration to another of the form

\[ \frac{\textit{premises}}{\textit{conclusion}} \quad [\textit{condition}] \]

Declaratively, the conclusion holds if the premises and condition hold. Procedurally,
to establish the conclusion, establish the premises and then check that the condition
holds. We write $\Sigma \xrightarrow{I} \Sigma'$ to mean an itinerary $I$ causes a
transition from configuration $\Sigma$ to configuration $\Sigma'$.

We assume that all agents in an itinerary have a starting place (which we call the
agent's home) denoted by $h \in \mathbf{P}$. Given a collection of agents $Agts$, we
define a distinguished configuration called the home configuration denoted by
$\Sigma_h^{Agts}$. The home configuration is the configuration where all the agents
are at home. For example, the home configuration for a collection of agents $A$, $B$
and $C$ is:

\[ \{(A, id, h), (B, id, h), (C, id, h)\} \]

Note that the operations at home are the identity action $id \in \mathbf{O}$. Also,
all agents used in an itinerary must be mentioned in the home configuration, i.e. none
of the rules which follow can add new agents unless they are clones of existing agents.

Agent Movement ($A^a_p$). $A^a_p$ means "move agent $A$ to place $p$ and perform
action $a$". This expression is the smallest granularity mobility abstraction. It
involves one agent, one move and one action at the destination. The underlying
mobility mechanisms are hidden. So are the details of the action, which may change the
agent state or the context in which it is operating at the destination place:

\[ a : states(A) \times states(p) \to states(A) \times states(p) \]

In our agent model, each action is a method call of the class implementing $A$. The
implementation must check that $a$ is indeed implemented in $A$.

$0$ represents, for any agent $A$, the empty itinerary $A^{id}_{here}$, where the
agent performs the identity operation $id \in \mathbf{O}$ on the state at its current
place $here$.
The transition rule for agent movement replaces a triple from a configuration $\Sigma$:

\[ \Sigma \xrightarrow{A^a_p} (\Sigma \setminus \{(A, b, q)\}) \cup \{(A, a, p)\} \qquad (1) \]

For example, applying this rule to the home configuration with the expression $A^a_p$
gives a new configuration with only the location and action of $A$ changed:

\[ \{(A, id, h), (B, id, h), (C, id, h)\} \xrightarrow{A^a_p} \{(A, a, p), (B, id, h), (C, id, h)\} \]
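
As a minimal sketch of the movement rule, a configuration can be held as a set of (agent, action, place) triples and the rule applied as a set replacement. The function name `move` and the choice of Python frozensets are assumptions made for illustration only.

```python
def move(config, agent, action, place):
    """Agent-movement rule: replace the agent's triple with (agent, action, place)."""
    old = {triple for triple in config if triple[0] == agent}
    if len(old) != 1:
        # A configuration is a function from agents to places,
        # so each agent must occur in exactly one triple.
        raise ValueError(f"agent {agent!r} must occur exactly once in the configuration")
    return (config - old) | {(agent, action, place)}


# Home configuration for agents A, B and C, then the itinerary A^a_p.
home = frozenset({("A", "id", "h"), ("B", "id", "h"), ("C", "id", "h")})
after = move(home, "A", "a", "p")
```

Only the triple for `A` changes; the triples for `B` and `C` are carried over untouched, as in the example above.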



Parallel Composition ("$\parallel$"). Two expressions composed by "$\parallel$" are
executed in parallel. For instance, $(A^a_p \parallel B^b_q)$ means that agents $A$
and $B$ are executed concurrently. Parallelism may imply cloning of agents. For
instance, to execute the expression $(A^a_p \parallel A^b_q)$, where $p \neq q$,
cloning is needed since agent $A$ has to perform actions at both $p$ and $q$ in
parallel. In the case where $p = q$, the agents are cloned as if $p \neq q$. In
general, given an itinerary $(I \parallel J)$ the agents in $ags(I) \cap ags(J)$ are
cloned and, although having the same name, are different agents. In the transition
rule below, we show how clones are distinguished for the purposes of specifying the
operational semantics.

Before presenting the rule, given $Agts \subseteq \mathbf{A}^{\{0,1\}^*}$ and a
configuration $\Sigma$, we define the operator "$-$" which removes triples mentioned
in $Agts$ from $\Sigma$:

\[ \Sigma - Agts = \Sigma \setminus \{(A, a, p) \mid A \in Agts,\ (A, a, p) \in \Sigma\} \]

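The transition rule for parallel composition is as follows:

\[ \frac{renamed^I(\Sigma) \xrightarrow{renamed^I(I)} \Sigma' \;\wedge\; renamed^J(\Sigma) \xrightarrow{renamed^J(J)} \Sigma''}{\Sigma \xrightarrow{I \parallel J} \Sigma' \cup \Sigma''} \qquad (2) \]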



$renamed^I(\Sigma)$ is $\Sigma$ without the triples concerning agents in $ags(J)$ and
with agents in $ags(I) \cap ags(J)$ renamed:
$renamed^I(\Sigma) = (\Sigma - ags(J)) \cup \{(A^0, a, p) \mid A \in (ags(I) \cap ags(J)),\ (A, a, p) \in \Sigma\}$.
Similarly,
$renamed^J(\Sigma) = (\Sigma - ags(I)) \cup \{(A^1, a, p) \mid A \in (ags(I) \cap ags(J)),\ (A, a, p) \in \Sigma\}$.

This means that agents to be cloned (i.e. those mentioned in $ags(I) \cap ags(J)$) and
clones are renamed apart. With the naming of clones, when the resulting configurations
are combined in $\Sigma' \cup \Sigma''$, the clones retain their identities.

We use $renamed^I(I)$ to denote the corresponding renaming of $I$ (defined recursively
on the algebra): $renamed^I(0) = 0$, $renamed^I(A^a_p) = (A^0)^a_p$ if
$A \in ags(I) \cap ags(J)$, $renamed^I(A^a_p) = A^a_p$ if
$A \notin ags(I) \cap ags(J)$, $renamed^I(A^a_l) = (A^0)^a_l$ if
$A \in ags(I) \cap ags(J)$, $renamed^I(A^a_l) = A^a_l$ if
$A \notin ags(I) \cap ags(J)$,
$renamed^I(I \cdot J) = renamed^I(I) \cdot renamed^I(J)$,
$renamed^I(I \parallel J) = renamed^I(I) \parallel renamed^I(J)$,
$renamed^I(I \mid J) = renamed^I(I) \mid renamed^I(J)$, and
$renamed^I(I : J) = renamed^I(I) : renamed^I(J)$. $renamed^J$ is defined similarly.

Parallel composition splits the configuration $\Sigma$ into $renamed^I(\Sigma)$ and
$renamed^J(\Sigma)$, each acted on separately by $I$ and $J$ respectively. An example
illustrates this.

Let $\Sigma = \{(A, a, p), (B, b, q), (C, c, r), (D, d, s)\}$, $I = A^e_t \cdot B^g_v$,
and $J = A^f_u \cdot C^h_w$. Then,
$renamed^I(\Sigma) = \{(A^0, a, p), (B, b, q), (D, d, s)\}$,
$renamed^J(\Sigma) = \{(A^1, a, p), (C, c, r), (D, d, s)\}$,
$renamed^I(I) = (A^0)^e_t \cdot B^g_v$, and $renamed^J(J) = (A^1)^f_u \cdot C^h_w$.
And so, $\Sigma' = \{(A^0, e, t), (B, g, v), (D, d, s)\}$ and
$\Sigma'' = \{(A^1, f, u), (C, h, w), (D, d, s)\}$ (by the semantics of "$\cdot$" given
below). The resulting configuration is
$\Sigma' \cup \Sigma'' = \{(A^0, e, t), (A^1, f, u), (B, g, v), (C, h, w), (D, d, s)\}$ which
contains $A^0$ and its clone $A^1$. In the next operator, combining of clones is carried out.
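
The splitting step of this example can be checked with a short sketch. Clone names are encoded here by appending "0" or "1" to the agent's name string, and the helper names `minus` and `renamed` mirror the operators in the text; the encoding and the function signatures are assumptions for illustration.

```python
def minus(config, agents):
    """The "-" operator: drop every triple whose agent is in `agents`."""
    return frozenset(t for t in config if t[0] not in agents)


def renamed(config, other_side_agents, shared, bit):
    """Drop the other side's agents, then re-add the shared agents under clone names."""
    clones = {(agent + bit, action, place)
              for (agent, action, place) in config if agent in shared}
    return minus(config, other_side_agents) | clones


sigma = frozenset({("A", "a", "p"), ("B", "b", "q"), ("C", "c", "r"), ("D", "d", "s")})
ags_i, ags_j = {"A", "B"}, {"A", "C"}      # agents of I and of J in the example
shared = ags_i & ags_j                     # agents that must be cloned: {"A"}

left = renamed(sigma, ags_j, shared, "0")  # configuration acted on by I
right = renamed(sigma, ags_i, shared, "1") # configuration acted on by J
```

Agent `D`, which occurs in neither itinerary, appears on both sides and is merged back into a single triple when the two resulting configurations are united.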

Sequential Composition ("$\cdot$"). Two expressions composed by the operator "$\cdot$"
are executed sequentially. For example, $(A^a_p \cdot A^b_q)$ means move agent $A$ to
place $p$ to perform action $a$ and then to place $q$ to perform action $b$.
Sequential composition is used when order of execution matters. In the example, state
changes to the agent from performing $a$ at $p$ must take place before the agent goes
to $q$.

Sequential composition imposes synchronization among agents. For example, in
the expression $(A^a_p \parallel B^b_q) \cdot C^c_r$, the composite action
$(A^a_p \parallel B^b_q)$ must complete before $C^c_r$ starts. Implementation of such
synchronization requires message-passing between agents at different places or via
shared memory abstractions.

When cloning has occurred, sequential composition performs decloning, i.e. clones
are combined. For example, consider the expression
$(A^a_s \parallel A^b_t) \cdot A^c_u$ and suppose that after the parallel operation,
the configuration has clones. Then, decloning is carried out before continuing with
$A^c_u$.

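The transition rule for sequential composition is:

\[ \frac{\Sigma \xrightarrow{I} \Sigma' \;\wedge\; decloned(\Sigma') \xrightarrow{J} \Sigma''}{\Sigma \xrightarrow{I \cdot J} \Sigma''} \qquad (3) \]

where $decloned(\Sigma')$ is the configuration $\Sigma'$ with all clones substituted by
their combinations.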

We now define $decloned$. When two clones $A^{x0}$ and $A^{x1}$, for some
$x \in \{0,1\}^*$, are combined, we name the combined agent $A^x$. We denote the
combination operation by
$\oplus : \mathbf{A}^{\{0,1\}^*} \times \mathbf{A}^{\{0,1\}^*} \to \mathbf{A}^{\{0,1\}^*}$.
The semantics of $\oplus$, i.e. how the states and code of the clones are combined, is
left to the implementation. Also, the place where an agent is combined with its clone
is the agent's place. The clone which is not at that location will move to that
location before combination takes place. For example, when combining $A^{x0}$ and
$A^{x1}$, $A^x$ resides at $A^{x0}$'s place and $A^{x1}$ must move to that place before
the combination takes place. Let $\Sigma$ and $\Sigma'$ be configurations such that
$decloned(\Sigma) = \Sigma'$. Then $\Sigma'$ is computed by the following algorithm
where, in each iteration, two clones are replaced by their combination until no
combinations are possible:

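1. If there exist $(A^{x0}, a, p), (A^{x1}, b, q) \in \Sigma$, for some
$x \in \{0,1\}^*$, then go to (2), else $\Sigma' := \Sigma$ (finish).

2. Modify $\Sigma$: $\Sigma := (\Sigma \setminus \{(A^{x0}, a, p), (A^{x1}, b, q)\}) \cup \{(A^x, id, p)\}$
where $A^x = A^{x0} \oplus A^{x1}$. Go to (1).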

For example, decloning the configuration
$\{(A^0, a, p), (A^{10}, b, q), (A^{11}, c, r), (D, d, s)\}$ gives the following result
after the first iteration: $\{(A^0, a, p), (A^1, id, q), (D, d, s)\}$ and after the
second (final) iteration, we have: $\{(A, id, p), (D, d, s)\}$. The combined agent
has the identity operation $id$ at $p$.
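
The decloning loop can be sketched as follows. Agents are encoded as (base name, clone bit string) pairs, so $A^{10}$ becomes `("A", "10")` and an agent without superscript has the empty string; this encoding, and the use of a fourth, unrelated agent called `D`, are assumptions made for illustration. How $\oplus$ merges state and code is left open in the text, so the sketch only tracks names, actions and places.

```python
def decloned(config):
    """Merge sibling clones A^{x0}, A^{x1} into A^x until no such pair remains."""
    sigma = set(config)
    while True:
        found = None
        for triple in sigma:
            (base, bits), _action, place = triple
            if bits.endswith("0"):
                sibling_name = (base, bits[:-1] + "1")
                sibling = next((t for t in sigma if t[0] == sibling_name), None)
                if sibling is not None:
                    # The combined agent stays at A^{x0}'s place with action id.
                    found = (triple, sibling, ((base, bits[:-1]), "id", place))
                    break
        if found is None:
            return frozenset(sigma)
        first, second, merged = found
        sigma -= {first, second}
        sigma.add(merged)


config = {
    (("A", "0"), "a", "p"),
    (("A", "10"), "b", "q"),
    (("A", "11"), "c", "r"),
    (("D", ""), "d", "s"),
}
result = decloned(config)
```

In this input only one sibling pair is available at each step, so the result does not depend on the order in which the set is scanned.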

We can associate the operator "$\oplus$" with "$\cdot$" by writing "$\cdot_\oplus$".

In the expression $(A^a_s \parallel A^b_t) \cdot A^c_u$, decloning could take place at
the following destination $u$. However, in other expressions such as
$(A^a_s \parallel A^b_t) \cdot (A^c_u \parallel A^d_v)$ there are two possible
destinations $u$ and $v$. Also, note that since the clones have the same names at
this level (they are given distinguished names only for the purposes of specifying
their operational semantics), it is also unclear from the expression which agent (the
original or its clone) goes to $u$ or to $v$. Our operational semantics solves this
ambiguity by viewing the above expression as causing a cloning (in the first parallel
composition), a combination (in the sequential composition), and another cloning (in
the second parallel composition).



Independent Nondeterminism ("$\mid$"). An itinerary of the form $(I \mid J)$ is used to
express nondeterministic choice: "I don't care which but perform one of $I$ or $J$". If
$ags(I) \cap ags(J) \neq \emptyset$, no clones are assumed, i.e. $I$ and $J$ are
treated independently. It is an implementation decision whether to perform both
actions concurrently, terminating when either one succeeds (which might involve
cloning, but clones are destroyed once a result is obtained), or to try one at a time
(in which case order may matter).

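Two transition rules are used to represent the nondeterminism:

\[ \frac{\Sigma \xrightarrow{I} \Sigma'}{\Sigma \xrightarrow{I \mid J} \Sigma'} \qquad (4a) \]

\[ \frac{\Sigma \xrightarrow{J} \Sigma''}{\Sigma \xrightarrow{I \mid J} \Sigma''} \qquad (4b) \]

These rules show that $I \mid J$ leads to one of two possible configurations.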



Conditional Nondeterminism ("$:$"). Independent nondeterminism does not specify
any dependencies between its alternatives. We introduce conditional nondeterminism
which is similar to short-circuit evaluation of boolean expressions in programming
languages such as C.

We first introduce status flags and the global state function:

- A status flag is always part of the agent's (say, $A$'s) state, written as
$A.status$. Being part of the state, $A.status$ is affected by an agent's actions.
$A.status$ might change as the agent performs actions at different places.

- A global state function
$\Pi : \wp(\mathbf{A}^{\{0,1\}^*} \times \mathbf{O} \times \mathbf{P}) \to \{true, false\}$
maps a configuration to a boolean value. $\Pi$ need not be defined in terms of status
flags but it is useful to do so. For example, we can define $\Pi$ as the conjunction of
the status flags of agents in a configuration $\Sigma$:
$\Pi(\Sigma) = \bigwedge_{(A, a, p) \in \Sigma} A.status$.

We can view $\Pi$ as producing a global status flag. From the implementation
viewpoint, if a configuration involves more than one agent, these agents must
communicate to compute $\Pi$.

The semantics of conditional nondeterminism depends on some given $\Pi$ in the
way described above. We express this dependency of "$:$" on a $\Pi$ by writing
"$:_\Pi$". The transition rules for "$:_\Pi$" are as follows:



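\[ \frac{\Sigma \xrightarrow{I} \Sigma'}{\Sigma \xrightarrow{I :_\Pi J} \Sigma'} \quad [\Pi(\Sigma') = true] \qquad (5a) \]

\[ \frac{\Sigma \xrightarrow{I} \Sigma' \;\wedge\; decloned(\Sigma') \xrightarrow{J} \Sigma''}{\Sigma \xrightarrow{I :_\Pi J} \Sigma''} \quad [\Pi(\Sigma') = false] \qquad (5b) \]

Rule (5b) is similar to rule (3).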
Note that we can introduce variations of "$:$" with different global state functions
such as "$:_\Pi$" and "$:_{\Pi'}$", and similarly with "$\cdot_\oplus$" and
"$\cdot_{\oplus'}$". The operational semantics of the operators and how $\oplus$ and
$\Pi$ are used is specified as above (and so, application-independent), but definitions
for $\oplus$ and $\Pi$ are application-specific.

Given the above rules, an itinerary $I$, a set of agents $ags(I)$, and home
configuration $\Sigma_h^{ags(I)}$, deriving the conclusion
$\Sigma_h^{ags(I)} \xrightarrow{I} \Sigma'$ (i.e. computing $\Sigma'$) will form a
derivation tree whose root is that conclusion, whose internal nodes are derived using
the above rules and whose leaves are empty.

Algebraic Properties of Itineraries. Due to space limitations, we will not discuss the 
algebraic properties of the operators in detail in this paper but state without proof the 
following properties used in our examples: 

- associativity: from the above definitions, sequential, parallel and both
nondeterministic compositions are associative, i.e. for example,
$I \cdot (J \cdot K) = (I \cdot J) \cdot K$. Note that with parallel composition,
starting with the same configuration $\Sigma$, if $I \parallel (J \parallel K)$ brings
$\Sigma$ to $\Sigma'$ and $(I \parallel J) \parallel K$ brings $\Sigma$ to $\Sigma''$,
then $\Sigma'$ is the same as $\Sigma''$, except for naming of clones (e.g., $\Sigma'$
contains $A^{10}$ and $A^{11}$ and $\Sigma''$ contains $A^{00}$ and $A^{01}$).
Associativity lets us leave some brackets out in composite expressions.

- distributivity of "$\cdot$" over "$\mid$": $(I \mid J) \cdot K = (I \cdot K) \mid (J \cdot K)$



3.2 Two Examples 



Meeting Scheduling. We use a two phase process for demonstration purposes: (1) 
Starting from home, the meeting initiator sends an agent which goes from one partici- 
pant to another with a list of nominated times. As each participant marks the times they 
are not available, the list of nominated times held by the agent shortens as the agent 
travels from place to place. After visiting all places in its itinerary, the agent returns 
home. (2) At home, the meeting initiator selects a meeting time from the remaining 
unmarked times and informs the rest. 

With four participants (excluding the initiator), the mobility behaviour is given by: 



\[ A^{ask}_s \cdot A^{ask}_t \cdot A^{ask}_u \cdot A^{ask}_v \cdot A^{finalize}_h \cdot (A^{inform}_s \parallel A^{inform}_t \parallel A^{inform}_u \parallel A^{inform}_v) \]

ask is an action which displays unmarked nominated times to a participant and allows 
a participant to mark times he/she is unavailable, finalize allows the meeting initiator 
to select a meeting time from the remaining unmarked times, and inform presents the 
selected meeting time to a participant. Note that the expression of mobility is separated 
from the coding of these three actions. 



Sales Order Processing. We consider a scenario adapted from [9] for processing sales
orders in a virtual enterprise. Each sales order is carried out by a mobile agent which
moves through several entities to process the order. We first name the entities. Let
$us.sc$ be a place where the agent can interact with the US stock control, $asia.sc$ be
a place where the agent can interact with the Asian stock control, $mat$ be a place
where the agent can purchase raw materials for manufacturing the products requested in
a sales order, $man$ be the place where the agent can place an order for products to be
manufactured (i.e., $man$ represents the manufacturer), and $ext$ be a place where the
agent can interact with an external buyer. Also, let $query$ be an action where the
agent queries a stock control, $report$ be an action where the agent reports the
results of a completed sales order, $buy.raw$ be the action of purchasing raw
materials, $buy.prod$ be the action of buying products for a sales order, and $order$
be an action of placing an order to have some products manufactured.

The business logic for processing a sales order is as follows. The agent first receives 
an order while at home. Then, one of the following takes place. 

1. The agent checks with the US stock control to see if the requested products are
available. If so, the agent returns home reporting this. We can represent this
behaviour as $A^{query}_{us.sc} \cdot A^{report}_h$.

2. Otherwise, the agent checks with the Asian stock control, and if the requested
products are available, reports this at home. This behaviour is captured by
$A^{query}_{asia.sc} \cdot A^{report}_h$.

3. If the Asian stock control does not have the products available, the agent purchases
raw materials for manufacturing the product and places an order for the product
with the manufacturer. Thereafter, the agent reports what it has done at home. We
write this behaviour as $A^{buy.raw}_{mat} \cdot A^{order}_{man} \cdot A^{report}_h$.

4. Alternatively, if the agent cannot fulfill (3), for example because the raw
materials are too expensive, the agent buys the products from an external buyer and
reports this: $A^{buy.prod}_{ext} \cdot A^{report}_h$.

In essence, there are four ways to process a sales order and we just want to perform 
one of them (each in the sense described above). We can capture the essence of the 
business logic as follows: 



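\[
\begin{aligned}
& (A^{query}_{us.sc} \cdot A^{report}_h) \mid (A^{query}_{asia.sc} \cdot A^{report}_h) \mid (A^{buy.raw}_{mat} \cdot A^{order}_{man} \cdot A^{report}_h) \mid (A^{buy.prod}_{ext} \cdot A^{report}_h) \\
={} & (A^{query}_{us.sc} \mid A^{query}_{asia.sc} \mid (A^{buy.raw}_{mat} \cdot A^{order}_{man}) \mid A^{buy.prod}_{ext}) \cdot A^{report}_h
\qquad \text{(by distribution of } \cdot \text{ over } \mid \text{)}
\end{aligned}
\]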



However, the above itinerary does not model the fact that the four ways of processing 
a sales order are tried sequentially and the next way is used only when one way has 
“failed” (e.g., if the product is not in stock, then get it manufactured). 

Using conditional nondeterminism, the sales order agent's behaviour is:

\[ (A^{query}_{us.sc} :_\Pi A^{query}_{asia.sc} :_\Pi (A^{buy.raw}_{mat} \cdot A^{order}_{man}) :_\Pi A^{buy.prod}_{ext}) \cdot A^{report}_h \]

This more precisely models the business logic. The operator is non-deterministic in the
sense that the resulting configuration of an operation $I :_\Pi J$ is decided by either
(5a) or (5b). However, the decision of which rule applies depends on $\Pi$. For
example, assuming we use the definition of $\Pi$ in §3.1 which is in terms of status
flags, if no stock is available at $us.sc$, then $A.status$ would be set to false. Note
that it is left to the actions of $A$ to properly set $A.status$ to reflect the
intended purpose of the application.
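
The short-circuit reading of rules (5a) and (5b) can be illustrated with a small interpreter sketch for the first two alternatives of this itinerary. Everything here is an assumption made for illustration: itineraries are modelled as functions on configurations, `status` stands in for the agents' status flags, `in_stock` is made-up data, and decloning is the identity because no clones arise in this example.

```python
status = {}   # hypothetical store for each agent's status flag (A.status)
trace = []    # places visited, recorded only to make the behaviour observable


def move(agent, action, place, succeeds):
    """Atomic itinerary: move `agent` to `place`, perform `action`, set the status flag."""
    def run(sigma):
        trace.append(place)
        status[agent] = succeeds(place)
        rest = {t for t in sigma if t[0] != agent}
        return frozenset(rest | {(agent, action, place)})
    return run


def pi(sigma):
    """Global state function: conjunction of the status flags of the agents in sigma."""
    return all(status.get(agent, True) for agent, _, _ in sigma)


def cond(first, second, declone=lambda sigma: sigma):
    """I :_Pi J. Rule (5a): keep I's result if Pi holds; rule (5b): else declone, run J."""
    def run(sigma):
        sigma1 = first(sigma)
        return sigma1 if pi(sigma1) else second(declone(sigma1))
    return run


def seq(first, second):
    """I . J without clones: run I, then J on the resulting configuration."""
    return lambda sigma: second(first(sigma))


in_stock = {"us.sc": False, "asia.sc": True}   # made-up stock data


def query(place):
    return move("A", "query", place, lambda p: in_stock[p])


report = move("A", "report", "h", lambda p: True)
itinerary = seq(cond(query("us.sc"), query("asia.sc")), report)
final = itinerary(frozenset({("A", "id", "h")}))
```

With the US stock control out of stock, the agent's flag is false after the first query, so rule (5b) sends it on to the Asian stock control before it reports at home.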



4 Regular Itineraries 

So far, the place $p$ in $A^a_p$ is assumed to be a constant. This section replaces
$p$ with computed destinations and introduces iterative behaviour produced by
repeatedly applying the same computation to an agent's current location.

Placement Operation. Given a set $\mathbf{P}$ of place symbols, we define a partial
function $o : \mathbf{P} \to \mathbf{P}$, which maps a place to another and which we
call a placement operation. For some places, $o$ is undefined, i.e. returns $\bot$, and
always $o(\bot) = \bot$. In place of a place constant, we can write a placement
operation.

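$A^a_o$ denotes "move agent $A$ to the place obtained by applying $o$ to $A$'s current
position":

\[ \Sigma \xrightarrow{A^a_o} (\Sigma \setminus \{(A, b, p)\}) \cup \{(A, a, o(p))\} \quad [o(p) \neq \bot] \qquad (1'a) \]

In case $o(p) = \bot$, we define $A^a_o$ as the empty itinerary $0$, i.e. the agent $A$
does nothing and stays at its current position:

\[ \Sigma \xrightarrow{A^a_o} (\Sigma \setminus \{(A, b, p)\}) \cup \{(A, id, p)\} \quad [o(p) = \bot] \qquad (1'b) \]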

We consider $A.here$, denoting agent $A$'s current location, as part of the
agent's state like $A.status$. This permits the agent to record its own location
for its own use in its actions. For example, given the current configuration
$\{(A^0, d, s), (A^1, e, t), (B, b, q), (C, c, r)\}$, $A^0.here = s$, $A^1.here = t$,
$B.here = q$, $C.here = r$. Note that the value of $A.here$ for any agent $A$ might be
changed by itineraries involving $A$ but always has the initial value of $h$.

We do not specify how o computes its value which is application-specific, but note 
that evaluation of o occurs at the agent’s current location. Placement operations are 
intended for capturing the idea that the agent’s destination could be dynamically com- 
puted at run-time rather than declared statically as in the previous section. 

Itineraries involving a mix of place symbols and placement operations are admitted,
such as $A^a_p \cdot A^b_o$, which moves agent $A$ to $p$ and then to $o(p)$.

We also define operators to allow placement operations to be combined. The opera- 
tors are similar to those in regular expressions as we show below. 



Placement Language. Let $PO \subseteq (\mathbf{P} \to \mathbf{P})$ be a set of
placement operations. We define a placement language $L_{PO}$ representing
combinations of placement operations in $PO$ as follows. The members of a placement
language are called regular itineraries.

Let $l_1, l_2 \in L_{PO}$. Then $PO \subseteq L_{PO}$, i.e. $L_{PO}$ contains all
placement operations in $PO$, $(l_1 \parallel l_2) \in L_{PO}$,
$(l_1 \cdot l_2) \in L_{PO}$, $(l_1 \mid l_2) \in L_{PO}$, $(l_1 : l_2) \in L_{PO}$,
$l_1^n \in L_{PO}$, $l_1^* \in L_{PO}$, and no other strings are in $L_{PO}$.

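The semantics of the above operators for placement operations are induced by that
for the operators for itineraries. Also, we always apply the regular itineraries on
$A.here$. Let $l_1, l_2 \in L_{PO}$, $A \in \mathbf{A}^{\{0,1\}^*}$, $a \in \mathbf{O}$.
Then,

\[
A^a_{l_1 \parallel l_2} = A^a_{l_1} \parallel A^a_{l_2}, \quad
A^a_{l_1 \cdot l_2} = A^a_{l_1} \cdot A^a_{l_2}, \quad
A^a_{l_1 \mid l_2} = A^a_{l_1} \mid A^a_{l_2}, \quad
A^a_{l_1 : l_2} = A^a_{l_1} : A^a_{l_2}, \quad \text{and} \quad
A^a_{l_1^n} = \underbrace{A^a_{l_1} \cdot \ldots \cdot A^a_{l_1}}_{n \text{ times}}
\]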

Also, let $f^n(x)$ be $n$ repeated applications of function $f$ on $x$ (e.g.,
$f^3(x) = (f \circ f \circ f)(x) = f(f(f(x)))$, where $\circ$ is function composition),
$l \in L_{PO}$ be some sequential composition of placement operations (i.e.,
$l = o_1 \cdot o_2 \cdot \ldots \cdot o_k$, where $k \in \mathbb{Z}^+$ and
$o_i \in PO$ for $1 \leq i \leq k$), and $rew(l)$ be a rewriting of $l$ in reverse by
replacing "$\cdot$" with "$\circ$" (e.g.
$rew(o_1 \cdot o_2 \cdot o_3) = o_3 \circ o_2 \circ o_1$). Then, we define $A^a_{l^*}$ as

\[ \underbrace{A^a_l \cdot \ldots \cdot A^a_l}_{n \text{ times}} \]

where $n$ is the smallest integer such that $rew(l)^{(n+1)}(x) = \bot$, and $x$ is the
current location of $A$ just before the start of the operation $A^a_{l^*}$. For
example, if the current location of $A$ is $p$, $l = o_1 \cdot o_2$,
$o_2(o_1(p)) \neq \bot$, $o_2(o_1(o_2(o_1(p)))) \neq \bot$ and
$o_2(o_1(o_2(o_1(o_2(o_1(p)))))) = \bot$, then $rew(l) = o_2 \circ o_1$, and
$(o_2 \circ o_1)^3(p) = \bot$, and so, $A^a_{l^*} = A^a_{l^2}$. If there is no such $n$,
we define $A^a_{l^*} = 0$. In addition, we define $A^a_{l^*} = 0$ for $l$ not a
sequential composition of placement operations since expressions like
$(o_1 \parallel o_2)(p)$ are not defined as functions over $\mathbf{P}$, though we have
lifted sequential composition to function composition using $rew$. We can think of
$A^a_{l^n}$ and $A^a_{l^*}$ as iterators
over the places obtained from repeatedly applying $l$.
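
The number of repetitions that $l^*$ unfolds to can be computed as sketched below. Here $\bot$ is modelled by Python's `None`, a sequential composition $l$ is a list of functions, and the `limit` parameter is an assumption added so that the search stops when no suitable $n$ is found (the case the text defines as the empty itinerary).

```python
BOTTOM = None  # stands for the undefined place


def rew(ops):
    """rew(l) for l = o_1 . o_2 . ... . o_k: apply o_1 first, then o_2, and so on."""
    def step(place):
        for op in ops:
            if place is BOTTOM:      # o(bottom) = bottom
                return BOTTOM
            place = op(place)
        return place
    return step


def star_count(ops, start, limit=10_000):
    """Smallest n with rew(l)^(n+1)(start) = bottom, or None if none below `limit`."""
    step = rew(ops)
    place = start
    for n in range(limit):
        place = step(place)          # place is now rew(l)^(n+1)(start)
        if place is BOTTOM:
            return n
    return None


def succ(x):
    """A toy placement operation on integer places, undefined from 5 onwards."""
    return x + 1 if x < 5 else BOTTOM


repetitions = star_count([succ, succ], 0)
```

Starting from place 0, applying the pair of operations twice reaches place 4, and the third application is undefined, so the star unfolds to two repetitions, matching the worked example in the text.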

An Example: Network Modelling. In [14], mobile agents are applied to network
modelling, where the agent visits a collection of nodes identifying those nodes
satisfying some criteria (e.g., hosts with CPU utilization > 90% and less than 1GB of
disk space). The agent which is injected into the network carries a pre-configured
itinerary describing a traversal of a collection of nodes.

Let $P$ be the collection of nodes (places), $next : P \to P$ be the placement
operation which given a node returns the next node in the traversal, $A$ be the agent,
and $test$ be the action performed by the agent on a node to determine if the node
satisfies the specified criteria. Nodes passing the test are recorded by the agent. The
last node $p_{last}$ in the traversal is modelled by $next(p_{last}) = \bot$. After it
has completed its traversal, we send the agent home, where it reports its results via
the operation $report$. The agent's
itinerary is represented by: $A^{test}_{next^*} \cdot A^{report}_h$.
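
Read procedurally, this itinerary is a loop over the places produced by $next$. The sketch below assumes a made-up three-node network, models $\bot$ as `None`, and uses a CPU threshold as the test; the names `traverse`, `successor` and `cpu` are illustrative only.

```python
def traverse(start, next_place, test):
    """Visit next(start), next(next(start)), ... until next is undefined; collect hits."""
    passed = []
    place = next_place(start)
    while place is not None:
        if test(place):
            passed.append(place)     # the agent records nodes passing the test
        place = next_place(place)
    return passed                    # reported once the agent is back home


successor = {"h": "n1", "n1": "n2", "n2": "n3"}   # next(n3) is undefined
cpu = {"n1": 95, "n2": 40, "n3": 99}              # made-up CPU utilization in percent

overloaded = traverse("h", successor.get, lambda node: cpu[node] > 90)
```

The loop visits exactly the nodes reached before $next$ becomes undefined, which is the number of repetitions the starred itinerary unfolds to.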

We can extend the above itinerary so that the agent traverses three different domains,
each domain a collection of nodes, and two adjacent domains connected by a gateway
node. Let $P, Q, R$ be the collections of nodes (places) in each domain,
$next_P : P \to P$, $next_Q : Q \to Q$, $next_R : R \to R$ be placement operations for
each respective domain, $pq$ be the gateway node between domains $P$ and $Q$ and $qr$
be the gateway node between domains $Q$ and $R$, and $sec$ be an action where the agent
requests permission to cross into another domain. Then, the agent's extended itinerary
is as follows:

\[ A^{test}_{next_P^*} \cdot A^{sec}_{pq} \cdot A^{test}_{next_Q^*} \cdot A^{sec}_{qr} \cdot A^{test}_{next_R^*} \cdot A^{report}_h \]

5 Related Work 

In many Java mobile agent toolkits such as IBM Aglets [7], a sequential itinerary can 
be defined as a separate object, where each element of the itinerary is a pair: (place 
address, method call). Such itineraries can be defined using our “ •” operator. 

Emerging work attempts to provide further aid for programming more complex 
agent itineraries. For example, JMTs mentioned in §1 enable agent itineraries or plans 
to be composed where the plans may involve parallelism, cloning and combining of 
clones. We contend that our itineraries provide greater economy of expression for repre- 
senting mobility behaviour. Moreover, although we have adopted an object-based agent 
model, our itineraries are programming language independent - actions could have been 
written in C, say. However, JMT would provide a convenient platform for implementing
our itineraries with Java. 

The coordination patterns for information agents described in [13] have more de- 
tails about how they are to be used but are presented less formally than our itineraries. 
Finite state machines (FSMs) are used to represent agent itineraries in [6]. Each state of 
the machine represents a task which is either some computation at a place or a move- 
ment to another place. Sequential and nondeterministic dependencies between tasks 
are expressed conveniently with FSMs. In [2], interaction diagrams are used to repre- 
sent agent mobility behaviour, vertical lines represent places and arrows between these 
lines represent movements of agents between places. Such a visual notation is highly 
appealing for non-programmers. In the PLANGENT mobile agent system [8], plans 
are dynamically created from rules expressed in the declarative KIF language. Their 
advantage is dynamic planning but the plans are sequences of tasks: parallelism is not 
expressed. In contrast to these techniques, we give a formal and compositional approach 
to itineraries. 

The Pict language [11], derived from the Pi-Calculus, and the Join-Calculus language [3],
derived from a variant of the Pi-Calculus, both contain a parallel composition
operator for concurrently executing processes. However, Pict does not have the concept
of agent places: where processes execute is not explicit. Join-Calculus has the concept
of addressable locations corresponding to agent places. The locations are themselves
mobile and have embedded code, and so, locations also correspond to mobile agents.
However, an individual agent view of mobility is employed: a location’s code contains
a primitive go(address) which moves the location itself to another location at
address. In contrast, our itineraries take a top-level (“God-view”) view of mobile agents
where mobility is expressed outside of the agent’s actions. Moreover, Join-Calculus
does not have implicit cloning.

The ambient calculus [1] focuses on movements across administrative domains and 
boundaries. An ambient is a named location which may contain processes or subambients.
An ambient’s capabilities are moving inside or outside other ambients, and removing
the boundary of an ambient, exposing its contents. There are operators for replication
and parallel composition of processes and ambients. In contrast to the ambient calculus,
our algebra focuses on itineraries of movements. We have assumed a flat space of places 
(modelled simply as a set P). We could explore a more structured space of places by 
modelling P as an ambient with subambients. 

6 Conclusion 

We have presented an algebra of itineraries for programming the mobility aspect of 
agents, and illustrated our operators via examples in three scenarios: meeting schedul- 
ing, sales order processing and network modelling. We are exploring templates for 
itinerary reuse, and extensions to atomic movement (given in §3): (1) When an agent
moves from p to q, it goes from p to a mediator place s and then to q. The advantage
of this is greater fault tolerance and allowance for mobile places: the agent can wait at
s for q to reconnect to s. Similarly, we can also model domain crossings if the mediator
place is a gateway. (2) Optimise bandwidth usage by first moving the agent’s state
over to the destination, and then, only transfer the code if the agent’s code for actions
is not already available at the destination. If places run on small devices such as PDAs, 
a further optimization is not to send the entire agent’s code but only code required for 
the specified action. (3) Generalize our place-level units to domain level units each con- 
sisting of an agent group - a collection of mobile agents perhaps with a leader agent, 
a domain - an Intranet consisting of a network of places, and a domain action - not a 
method call but refers to the agent group’s action in the domain. For instance, in the 
following expression, an agent group G moves sequentially through domains P, Q, and 
R collecting the information from each domain: Qcoi.aii . Q^Laii . Qc^i.aii ^ 
Abstract. We investigate OReX, a temporal logic for specifying open
systems. Path properties in OReX are expressed using $\omega$-regular expressions,
while similar logics for open systems, such as ATL* of Alur et al.,
use LTL for this purpose. Our results indicate that this distinction is an
important one. In particular, we show that OReX has a more efficient
model-checking procedure than ATL*, even though it is strictly more
expressive. To this end, we present a single-exponential model-checking
algorithm for OReX; the model-checking problem for ATL* in contrast
is provably double-exponential.



1 Introduction 

Reactive systems are computing systems where the computation consists of an 
ongoing interaction between components. This is in contrast to functional sys- 
tems whose main aim is to compute sets of output values from sets of input 
values. Typical examples of reactive systems include schedulers, resource alloca- 
tors, and process controllers. 

Pnueli [Pnu76] proposed the use of linear-time temporal logic (LTL) to specify
properties of reactive systems. A system satisfies an LTL specification if
all computations of the system are models of the formula. Later, branching- 
time temporal logics such as CTL and CTL* [CE81, EH86] were developed, 
which permit explicit quantification over computations of a system. For both 
linear-time and branching-time logics, the model checking problem — verifying 
whether a given system satisfies a correctness property specified as a tempo- 
ral logic formula — has been well studied. Model checkers such as Spin for LTL 
[Hol97] and SMV for CTL [McM93] are gaining acceptance as debugging tools 
for industrial design of reactive systems. 

Alur et al. [AHK97] have argued that linear-time and branching-time tempo- 
ral logics may not accurately capture the spirit of reactive systems. These logics 
treat the entire system as a closed unit. When specifying properties of reactive 
systems, however, it is more natural to describe how different parts of the system 
behave when placed in contexts. In other words, one needs to specify properties 
of open systems that interact with environments that cannot be controlled. 

Partly supported by IFCPAR Project 1502-1. 

P.S. Thiagarajan, R. Yap (Eds.): ASIAN’99, LNCS 1742, pp. 227-238, 1999. 

(c) Springer-Verlag Berlin Heidelberg 1999

A fruitful way of modelling open systems is to regard the interaction between
the system and its environment as a game. The system’s goal is to move in such a 
way that no matter what the environment does, the resulting computation satis- 
fies some desired property. In this formulation, the verification problem reduces 
to checking whether the system has a winning strategy for such a game. 

The alternating temporal logic ATL* proposed in [AHK97] reflects the game- 
theoretic formulation of open systems. In ATL*, the property to be satisfied in 
each run of the system is described using an LTL formula, and LTL formulas 
may be nested within game quantifiers. The quantifiers describe the existence 
of winning strategies for the system. Moreover, a restricted version of ATL*
called ATL has an efficient model-checking algorithm. However, the model-checking
problem for the considerably more expressive logic ATL* is shown
to be 2EXPTIME-complete, which compares unfavourably with the
PSPACE-completeness of CTL*.

In this paper, we propose a logic for open systems called OReX, which is interpreted
over $\Sigma$-labelled transition systems and uses $\omega$-regular expressions rather
than LTL formulas to specify properties along individual runs. As in ATL*,
$\omega$-regular expressions may be nested within game quantifiers. Since $\omega$-regular
expressions are more expressive than temporal logics for specifying properties of
infinite sequences, our logic is more expressive than ATL*. Moreover, it turns
out that OReX has an exponential-time model checking algorithm, which is
considerably better than the complexity of ATL*.

The paper is organized as follows. Section 2 contains some basic definitions
about transition systems and games. Section 3 introduces the syntax and semantics
of OReX and gives some examples of how to write temporal specifications
in the logic. Section 4 describes an automata-theoretic algorithm for OReX’s
model-checking problem. Section 5 discusses related work and Section 6 offers
our concluding remarks.



2 Transition Systems, Games, Strategies 



Transition systems A $\Sigma$-labelled transition system is a tuple $TS = (X, \Sigma, \to)$
where $X$ is a finite set of states, $\Sigma$ is a finite alphabet of actions and
$\to \subseteq X \times \Sigma \times X$ is the transition relation. We use $x, y, \ldots$ to denote states
and $a, b, \ldots$ to denote actions.

An element $(x, a, y) \in \to$ is called an a-transition. For $S \subseteq \Sigma$, the set of
S-transitions is given by $\to_S = \{(x, a, y) \in \to \mid a \in S\}$. As usual, we write
$x \xrightarrow{a} y$ to denote that $(x, a, y) \in \to$. The set of transitions enabled at a state $x$,
denoted $enabled(x)$, is the set of transitions of the form $x \xrightarrow{a} y$ that originate at $x$. To
simplify the presentation, we assume that there are no deadlocked states, that
is, for each $x \in X$, $enabled(x) \neq \emptyset$.
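
As an illustration only, the definitions above can be written down as a small Python sketch. The class name `TransitionSystem`, its method names, and the two-state example are assumptions made for this sketch and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransitionSystem:
    """A labelled transition system: states, actions, and (x, a, y) triples."""
    states: frozenset
    actions: frozenset
    transitions: frozenset  # set of triples (x, a, y)

    def enabled(self, x):
        """All transitions (x, a, y) that originate at state x."""
        return {t for t in self.transitions if t[0] == x}

    def s_transitions(self, S):
        """The S-transitions: those labelled with an action in S."""
        return {t for t in self.transitions if t[1] in S}

    def has_deadlock(self):
        """True if some state has no enabled transition."""
        return any(not self.enabled(x) for x in self.states)


# Hypothetical two-state example, not taken from the paper.
ts = TransitionSystem(
    states=frozenset({"x", "y"}),
    actions=frozenset({"a", "b"}),
    transitions=frozenset({("x", "a", "y"), ("y", "b", "x"), ("y", "a", "y")}),
)
```

The `has_deadlock` check corresponds to the standing assumption that every state has at least one enabled transition.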

A path in $TS$ is a sequence $x_0 \xrightarrow{a_1} x_1 \xrightarrow{a_2} \cdots x_n \xrightarrow{a_{n+1}} x_{n+1} \cdots$. Let $paths(TS)$
and $finite\_paths(TS)$ denote the set of infinite and finite paths in $TS$, respectively.
For a finite path $p$, let $last(p)$ denote the final state of $p$.

Games and strategies Let $S \subseteq \Sigma$. An S-strategy is a function $f$ from
$finite\_paths(TS)$ to subsets of the set of S-transitions $\to_S$ such that for each
$p \in finite\_paths(TS)$, $f(p) \subseteq enabled(last(p))$. If $\to_S \cap\, enabled(last(p)) \neq \emptyset$, we
require that $f(p) \neq \emptyset$.

An infinite path $p = x_0 \xrightarrow{a_1} x_1 \xrightarrow{a_2} \cdots$ in $paths(TS)$ is said to
be consistent with the S-strategy $f$ if for each $n \geq 1$ the transition $x_{n-1} \xrightarrow{a_n} x_n$
belongs to the set $f(x_0 \cdots x_{n-1})$ whenever $a_n \in S$.

A game over $TS$ is a pair $(\Pi, x)$ where $\Pi$ is a subset of $paths(TS)$ and $x$ is a
state in $X$. We say that $S$ has a winning strategy for the game $(\Pi, x)$ if there is
an S-strategy $f$ such that every infinite path that begins at $x$ and is consistent
with $f$ belongs to $\Pi$.
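
For one simple class of games the winning states can be computed directly. The sketch below is an illustration under assumptions, not the paper's algorithm: it takes the winning set to be "all paths that visit a given set of target states", ignores fairness, and uses the standard attractor fixpoint. The function name `winning_region` and the four-state example are hypothetical.

```python
def winning_region(states, transitions, S, target):
    """States from which S can force every consistent path to visit `target`.

    transitions: set of (x, a, y); S: the set of actions controlled by S.
    Assumes there are no deadlocked states.
    """
    win = set(target)
    changed = True
    while changed:
        changed = False
        for x in states - win:
            out = [t for t in transitions if t[0] == x]
            s_moves = [t for t in out if t[1] in S]
            other = [t for t in out if t[1] not in S]
            # Moves outside S cannot be pruned, so all of them must stay winning.
            others_ok = all(y in win for (_, _, y) in other)
            # If S has a move, the strategy must keep at least one: a winning one.
            s_ok = (not s_moves) or any(y in win for (_, _, y) in s_moves)
            if out and others_ok and s_ok:
                win.add(x)
                changed = True
    return win


# Hypothetical example: "c" is the controlled action, "e" is not.
demo_states = {0, 1, 2, 3}
demo_transitions = {(0, "c", 1), (0, "c", 0), (1, "e", 2), (2, "e", 2), (3, "e", 3)}
```

With `S = {"c"}` the strategy can drop the self-loop at state 0, so state 0 is winning; with `S` empty the self-loop cannot be pruned and state 0 is lost.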

Often, we are interested in restricting our attention to fair computations. An
infinite path $p \in paths(TS)$ is said to be weakly S-fair if there are infinitely
many states along $p$ where no S-transition is enabled or if there are infinitely
many S-transitions in $p$. The path is said to be strongly S-fair if there are only
finitely many positions where an S-transition is enabled or if there are infinitely
many S-transitions in $p$.

We can relativize our notion of winning strategies to fair paths in the obvious
way: $S$ has a weakly (strongly) fair winning strategy for $(\Pi, x)$ if there is an
S-strategy $f$ such that every weakly (strongly) S-fair path that begins at $x$ and
is consistent with $f$ belongs to $\Pi$.
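
For ultimately periodic paths, the two fairness notions reduce to checks on the repeated part of the path. The following Python sketch is an illustration under that restriction (general infinite paths are not covered); `is_s_fair` and the example transition sets are hypothetical names and data.

```python
def is_s_fair(transitions, S, loop, strong=False):
    """Check weak or strong S-fairness of an ultimately periodic path.

    `loop` is the repeated part of the path, given as a list of transitions
    (x, a, y) forming a cycle; the finite prefix before it is irrelevant.
    """
    # Infinitely many S-transitions occur iff the loop contains one.
    if any(a in S for (_, a, _) in loop):
        return True

    def s_enabled(x):
        return any(t[0] == x and t[1] in S for t in transitions)

    loop_states = [x for (x, _, _) in loop]
    if strong:
        # Only finitely many positions enable S: no loop state enables S.
        return not any(s_enabled(x) for x in loop_states)
    # Infinitely many positions disable S: some loop state disables S.
    return any(not s_enabled(x) for x in loop_states)


# Hypothetical fragment: action "eject" is always enabled in state "in".
demo = {("in", "idle", "in"), ("in", "eject", "out"), ("out", "idle", "out")}
idle_forever = [("in", "idle", "in")]

# Hypothetical system in which action "s" is enabled only in state "p".
alternating = {("p", "i", "q"), ("q", "i", "p"), ("p", "s", "r"), ("r", "i", "r")}
pq_loop = [("p", "i", "q"), ("q", "i", "p")]
```

In `demo`, idling forever in state `in` is not even weakly `{"eject"}`-fair, because `eject` stays enabled but is never taken. The `pq_loop` path separates the two notions: it is weakly but not strongly `{"s"}`-fair.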

3 The Logic 

3.1 Syntax 

Fix an alphabet $\Sigma$ and a set of propositions $Prop$. We simultaneously define the
sets $FPE$ of finite path expressions, $PE$ of (infinite) path expressions, and $\Phi$ of
state formulas as follows. We use $e$ to denote a typical element of $FPE$, $\alpha$ to
denote a typical element of $PE$, and $\varphi$ to denote a typical element of $\Phi$.

$e ::= \varphi \in \Phi \mid a \in \Sigma \mid e \cdot e \mid e + e \mid e^*$
$\alpha ::= e \cdot e^\omega \mid \alpha + \alpha$
$\varphi ::= tt \mid P \in Prop \mid \neg\varphi \mid \varphi \vee \varphi \mid E_S\,\alpha \ (S \subseteq \Sigma) \mid A_S\,\alpha \ (S \subseteq \Sigma)$

Observe that finite path expressions are just regular expressions whose alphabet
may include state formulas. Similarly, path expressions are traditional
$\omega$-regular expressions [Tho90] built from finite path expressions. For state
formulas, we can use $\neg$ and $\vee$ to derive the usual Boolean connectives such as $\wedge$
and $\Rightarrow$. Finally, note that the wffs of OReX are just the state formulas.

3.2 Semantics 

OReX formulas are interpreted over $\Sigma$-labelled transition systems equipped
with valuations. Let $TS = (X, \Sigma, \to)$ be a $\Sigma$-labelled transition system. A
valuation $v : X \to 2^{Prop}$ specifies which propositions are true in each state.

We interpret finite path expressions over finite paths in $TS$, path expressions
over infinite paths in $TS$, and state formulas at states of $TS$. If $p \in
finite\_paths(TS)$ and $e \in FPE$, we write $TS, p \models e$ to denote that $e$ is satisfied
along $p$. Similarly, for $p \in paths(TS)$ and $\alpha \in PE$, $TS, p \models \alpha$ denotes that $\alpha$ is
satisfied along $p$. Finally, for $x \in X$ and $\varphi \in \Phi$, $TS, x \models \varphi$ denotes that $\varphi$ is
satisfied at the state $x$.

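We define concatenation of paths in the usual way. Let $p = x_0 \xrightarrow{a_1} x_1 \cdots \xrightarrow{a_n} x_n$
and $p' = x'_0 \xrightarrow{a'_1} x'_1 \cdots \xrightarrow{a'_m} x'_m$ be paths such that $x_n = x'_0$. The path $p \cdot p'$ is the
path $x_0 \xrightarrow{a_1} x_1 \cdots \xrightarrow{a_n} x_n \xrightarrow{a'_1} x'_1 \cdots \xrightarrow{a'_m} x'_m$. The concatenation of a finite path $p$
with an infinite path $p'$ is defined similarly.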

The three forms of the satisfaction relation $\models$ are defined through simultaneous
induction as follows.

Finite path expressions

$TS, p \models \varphi$ iff $p = x_0$ and $TS, x_0 \models \varphi$.
$TS, p \models a$ iff $p = x_0 \xrightarrow{a_1} x_1$ and $a_1 = a$.
$TS, p \models e_1 \cdot e_2$ iff $p = p_1 \cdot p_2$, $TS, p_1 \models e_1$ and $TS, p_2 \models e_2$.
$TS, p \models e_1 + e_2$ iff $TS, p \models e_1$ or $TS, p \models e_2$.
$TS, p \models e^*$ iff $p = x_0$ or $p = p_1 \cdot p_2 \cdots p_m$ and for $i \in \{1, 2, \ldots, m\}$, $TS, p_i \models e$.
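
The clauses for finite path expressions translate almost directly into a recursive check over sub-paths. The sketch below is a hedged illustration: the tuple encoding of expressions (`"form"`, `"act"`, `"cat"`, `"alt"`, `"star"`) is an assumption of this example, and state formulas are supplied as Python predicates on states rather than evaluated by the logic itself.

```python
from functools import lru_cache


def satisfies(expr, states, actions):
    """Decide whether the finite path x0 -a1-> x1 ... -an-> xn satisfies expr.

    states = [x0, ..., xn], actions = [a1, ..., an].
    """
    n = len(actions)
    assert len(states) == n + 1

    @lru_cache(maxsize=None)
    def sat(e, i, j):
        # Does the sub-path from state index i to state index j satisfy e?
        kind = e[0]
        if kind == "form":   # state formula: single-state path where it holds
            return i == j and bool(e[1](states[i]))
        if kind == "act":    # action a: one-step path labelled a
            return j == i + 1 and actions[i] == e[1]
        if kind == "cat":    # split p = p1 . p2 at a shared state k
            return any(sat(e[1], i, k) and sat(e[2], k, j) for k in range(i, j + 1))
        if kind == "alt":
            return sat(e[1], i, j) or sat(e[2], i, j)
        if kind == "star":   # single state, or a non-trivial first piece then the rest
            return i == j or any(
                sat(e[1], i, k) and sat(e, k, j) for k in range(i + 1, j + 1)
            )
        raise ValueError(f"unknown expression kind: {kind}")

    return sat(expr, 0, n)


# Hypothetical example: (phi . (a + b))* . psi over integer-valued states.
phi = ("form", lambda x: x < 2)
psi = ("form", lambda x: x == 2)
any_action = ("alt", ("act", "a"), ("act", "b"))
until = ("cat", ("star", ("cat", phi, any_action)), psi)
```

In the `star` case the first piece is required to be non-trivial. This loses nothing, since single-state pieces can be dropped from any decomposition of a longer path, and it guarantees that the recursion terminates.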

Path expressions

$TS, p \models e_1 \cdot e_2^\omega$ iff there is a prefix $p_0 \cdot p_1 \cdot p_2 \cdots$ of $p$ such that
$TS, p_0 \models e_1$ and for $i \in \{1, 2, \ldots\}$, $TS, p_i \models e_2$.
$TS, p \models \alpha_1 + \alpha_2$ iff $TS, p \models \alpha_1$ or $TS, p \models \alpha_2$.

We can associate with each path expression $\alpha$ a set $\Pi_\alpha$ of infinite paths in
$TS$ in the obvious way: $\Pi_\alpha = \{p \in paths(TS) \mid TS, p \models \alpha\}$. We let $\Pi_{\neg\alpha}$ denote
the set $paths(TS) \setminus \Pi_\alpha$.



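State formulas

$TS, x \models tt$ always.
$TS, x \models P$, for $P \in Prop$, iff $P \in v(x)$.
$TS, x \models \neg\varphi$ iff $TS, x \not\models \varphi$.
$TS, x \models \varphi_1 \vee \varphi_2$ iff $TS, x \models \varphi_1$ or $TS, x \models \varphi_2$.
$TS, x \models E_S\,\alpha$ iff $S$ has a winning strategy for the game $(\Pi_\alpha, x)$.
$TS, x \models A_S\,\alpha$ iff $S$ does not have a winning strategy for the game $(\Pi_{\neg\alpha}, x)$.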



It is useful to think of $E_S\,\alpha$ as asserting that $S$ has a strategy to enforce $\alpha$ along
every path. Conversely, $A_S\,\alpha$ asserts that $S$ does not have a strategy to avoid $\alpha$
along every path. In other words, no matter what strategy $S$ chooses, there is
at least one path consistent with the strategy along which $\alpha$ holds.

In the absence of fairness constraints, the games we have described are determined
and $E_S\,\alpha$ is logically equivalent to $A_{\Sigma \setminus S}\,\alpha$. In other words, we can derive
one quantifier from the other. However, once we introduce fairness constraints,
the games are no longer determined, in general, and the two quantifiers need to
be introduced independently, as we have done here. However, even with fairness
constraints, $E_S\,\alpha$ implies $A_{\Sigma \setminus S}\,\alpha$.

3.3 Examples

Let $e$ be a finite path expression. We shall use the following abbreviations for
convenience.

— The formula $E_S\,e$ (respectively, $A_S\,e$) abbreviates the formula $E_S\,e \cdot tt^\omega$
(respectively, $A_S\,e \cdot tt^\omega$). Every path in $\Pi_{e \cdot tt^\omega}$ has a finite prefix that satisfies the
property $e$.

— The formula $E_S\,e^\omega$ (respectively, $A_S\,e^\omega$) abbreviates the formula $E_S\,tt \cdot e^\omega$
(respectively, $A_S\,tt \cdot e^\omega$).

The operators E and A of logics like CTL* are just abbreviations for $E_\Sigma$ and
$E_\emptyset$, respectively. Using these branching-time operators, we can specify some
traditional temporal properties as follows. Let the set of actions be $\{a_1, a_2, \ldots, a_m\}$.
In the examples, $\Sigma$ abbreviates the finite path expression $(a_1 + a_2 + \cdots + a_m)$.

— The LTL formula $\varphi\,\mathcal{U}\,\psi$ ($\varphi$ holds until $\psi$ holds) is captured by the path
expression $(\varphi\Sigma)^* \cdot \psi$. Thus, the state formula $A\,(\varphi\Sigma)^* \cdot \psi$ asserts that $\varphi\,\mathcal{U}\,\psi$
is true of every computation path.

— In LTL, the formula $\Diamond\varphi$ ($\varphi$ holds eventually) can be written as $tt\,\mathcal{U}\,\varphi$.
The corresponding path expression is $(tt\,\Sigma)^* \cdot \varphi$, which we shall denote
$Eventually(\varphi)$.

— Let $Invariant(\varphi)$ denote the path expression $(\varphi\Sigma)^\omega$. The expression
$Invariant(\varphi)$ asserts that the state formula $\varphi$ is invariant along the path,
corresponding to the LTL formula $\Box\varphi$. Thus, the state formula $A\,Invariant(\varphi)$
says that along all paths $\varphi$ always holds.

— The formula $A((E(\Sigma^* \cdot \varphi) \cdot \Sigma)^\omega)$ asserts that along every path $\varphi$ is always
attainable (Branching Liveness). This property has been used in the verification
of the Futurebus protocol [CGH+95].

We can also assert properties that are not expressible in CTL* (or ATL*).
For instance, $A(\Sigma a)^\omega$ asserts that along all paths, every even position is an $a$.
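
The abbreviations in this section are purely syntactic, so they can be generated mechanically. The following Python sketch is a hedged illustration: the tuple encoding and the helper names `sigma`, `until`, `eventually`, `invariant` and `show` are assumptions of this example, and `^w` is used as an ASCII rendering of the omega superscript.

```python
def sigma(actions):
    """The finite path expression (a1 + a2 + ... + am) matching any one action."""
    acts = sorted(actions)
    expr = ("act", acts[0])
    for a in acts[1:]:
        expr = ("alt", expr, ("act", a))
    return expr


def until(phi, psi, actions):
    """(phi . Sigma)* . psi"""
    return ("cat", ("star", ("cat", phi, sigma(actions))), psi)


def eventually(phi, actions):
    """(tt . Sigma)* . phi"""
    return until(("tt",), phi, actions)


def invariant(phi, actions):
    """(phi . Sigma)^omega"""
    return ("omega", ("cat", phi, sigma(actions)))


def show(e):
    """Render an expression as plain text."""
    kind = e[0]
    if kind == "tt":
        return "tt"
    if kind in ("prop", "act"):
        return e[1]
    if kind == "cat":
        return f"{show(e[1])}.{show(e[2])}"
    if kind == "alt":
        return f"({show(e[1])} + {show(e[2])})"
    if kind == "star":
        return f"({show(e[1])})*"
    if kind == "omega":
        return f"({show(e[1])})^w"
    raise ValueError(f"unknown expression kind: {kind}")


p = ("prop", "p")
print(show(eventually(p, {"a", "b"})))
print(show(invariant(p, {"a", "b"})))
```

Note that `until` and `eventually` build finite path expressions; by the first abbreviation above they stand for path expressions once followed by $tt^\omega$.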

Now, let us see how to use OReX to describe properties of a typical open system.
Figure 1 shows a variant of the train gate controller described in [AHK97].
In this system, the atomic propositions are {in_gate, out_of_gate, waiting, admitted}.
In the figure, the labels on the states indicate the valuation. The actions can be
partitioned into two sets, train and ctr, corresponding to two components of the
system, the train and the controller. The set train is given by
{idle, request, relinquish, enter, leave} while ctr is given by {grant, reject, eject}.
Here are some properties that one might like to assert about this system.

— If the train is outside the gate and has not yet been granted permission to
enter the gate, the controller can prevent it from entering the gate:

$A\ Invariant((out\_of\_gate \wedge \neg admitted) \Rightarrow E_{ctr}\, Invariant(out\_of\_gate))$

Fig. 1. A train gate controller



— If the train is outside the gate, the controller cannot force it to enter the
gate:

$A\ Invariant(out\_of\_gate \Rightarrow A_{ctr}\, Invariant(\neg in\_gate))$

This property can also be expressed by the dual assertion that the train can
choose to remain out of the gate.

$A\ Invariant(out\_of\_gate \Rightarrow E_{train}\, Invariant(\neg in\_gate))$

— If the train is outside the gate, the train and the controller can cooperate so
that the train enters the gate:

$A\ Invariant(out\_of\_gate \Rightarrow E\ Eventually(in\_gate))$

— If the train is in the gate, the controller can ensure that it eventually leaves
the gate.

$A\ Invariant(in\_gate \Rightarrow E_{ctr}\, Eventually(out\_of\_gate))$

Notice that this property is not satisfied by the system unless fairness is
introduced: after entering the gate, the train may execute the action idle
forever. However, since eject is continuously enabled at this state, even weak
fairness suffices to ensure that the controller can eventually force the train
to leave the gate.

— Actually, it is unrealistic to assume that the controller can eject the train
once it is in the gate. If we eliminate the transition labelled eject, we can
make the following assertion which guarantees that as long as the train does
not idle continuously, it eventually leaves the gate.

A Invariant(\r\-gate {^{idle)^ Ev entually {out -of -gate)) 

To state this property, we need to integrate assertions about actions and
states in the formula. This formula cannot be conveniently expressed in
ATL*, where formulas can only refer to properties of states.

4 The Model-Checking Algorithm



We now describe an automata-theoretic model checking algorithm for OReX
formulas. As we remarked earlier, path expressions can be regarded as $\omega$-regular
expressions over the infinite alphabet consisting of the finite set of actions $\Sigma$
together with the set of all state formulas. However, if we fix a finite set of state
formulas $\Psi$, we can regard each path expression which does not refer to state
formulas outside $\Psi$ as an $\omega$-regular expression over the finite alphabet $\Sigma \cup \Psi$. We
can associate with each such path expression $\alpha$ a language $L(\alpha)$ of infinite words
over $\Sigma \cup \Psi$ using the standard interpretation of $\omega$-regular expressions [Tho90].

Unfortunately, this naive translation does not accurately reflect the meaning
of path expressions: $\overline{L(\alpha)}$, the complement of $L(\alpha)$, may contain sequences that
are models of a. For instance, if L{a) contains aipiLp 2 ^^ but does not contain 
the second path will be present in L{a) though it models o;. We require 
more structure in the infinite sequences we associate with path expressions to 
faithfully translate logical connectives into automata-theoretic operations. 

We begin by defining an alternative model for finite path expressions and
path expressions in terms of sequences over an extended alphabet. Let $\Psi$ be
a finite set of state formulas. A $\Psi$-decorated sequence over $\Sigma$ is a sequence in
which subsets of $\Psi$ alternate with elements of $\Sigma$. More precisely, a $\Psi$-decorated
sequence over $\Sigma$ is an element of $(2^\Psi \cdot \Sigma)^*(2^\Psi) \cup (2^\Psi \cdot \Sigma)^\omega$. We use $s$ to denote
a finite $\Psi$-decorated sequence and $\sigma$ to denote an arbitrary (finite or infinite)
$\Psi$-decorated sequence. A finite $\Psi$-decorated sequence $s$ can be concatenated with
a $\Psi$-decorated sequence $\sigma$ if the last element of $s$ is identical to the first element
of $\sigma$. The concatenation is obtained by deleting the last element of $s$.
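
A minimal Python sketch of finite decorated sequences and their concatenation follows. It assumes sequences are stored as tuples in which even positions hold `frozenset`s of formula names and odd positions hold action names; the function names `is_decorated` and `concat` are hypothetical.

```python
def is_decorated(seq, psi, sigma):
    """Check that a finite sequence alternates subsets of psi with actions in sigma."""
    if len(seq) % 2 == 0:
        return False  # must start and end with a subset of psi
    sets_ok = all(
        isinstance(seq[i], frozenset) and seq[i] <= psi
        for i in range(0, len(seq), 2)
    )
    acts_ok = all(seq[i] in sigma for i in range(1, len(seq), 2))
    return sets_ok and acts_ok


def concat(s, t):
    """Concatenate decorated sequences that agree on the shared element."""
    if s[-1] != t[0]:
        raise ValueError("last element of s must equal first element of t")
    return s[:-1] + t  # drop the duplicated last element of s


# Hypothetical example over formulas {p, q} and actions {a, b}.
psi = frozenset({"p", "q"})
sigma = {"a", "b"}
s = (frozenset({"p"}), "a", frozenset({"q"}))
t = (frozenset({"q"}), "b", frozenset())
```

Here `concat(s, t)` yields the five-element sequence `({p}, a, {q}, b, {})`, in which the shared set `{q}` appears only once.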

We denote the set of finite path expressions and path expressions that do
not refer to any state formulas outside $\Psi$ as $FPE_\Psi$ and $PE_\Psi$, respectively. We
can interpret expressions from $FPE_\Psi$ and $PE_\Psi$ over $\Psi$-decorated sequences. The
satisfaction relation is defined as follows:

$s \models \varphi$ iff $s = A_0$ and $\varphi \in A_0$
$s \models a$ iff $s = A_0\, a\, A_1$
$s \models e_1 \cdot e_2$ iff $s = s_1 \cdot s_2$, $s_1 \models e_1$ and $s_2 \models e_2$
$s \models e_1 + e_2$ iff $s \models e_1$ or $s \models e_2$
$s \models e^*$ iff $s = A_0$ or $s = s_1 \cdot s_2 \cdots s_n$ and for $i \in \{1, 2, \ldots, n\}$, $s_i \models e$
$\sigma \models e_1 \cdot e_2^\omega$ iff there is a prefix $s_0 \cdot s_1 \cdot s_2 \cdots$ of $\sigma$ such that
$s_0 \models e_1$ and for $i \in \{1, 2, \ldots\}$, $s_i \models e_2$
$\sigma \models \alpha_1 + \alpha_2$ iff $\sigma \models \alpha_1$ or $\sigma \models \alpha_2$



Let $TS = (X, \Sigma, \to)$ be a $\Sigma$-labelled transition system and $v : X \to 2^{Prop}$ be
a valuation over $TS$. For each path $p = x_0 \xrightarrow{a_1} x_1 \ldots$ in $TS$, the corresponding
$\Psi$-decorated sequence $\sigma_p$ is $A_0\, a_1\, A_1 \ldots$ where $A_i = \{\varphi \mid \varphi \in \Psi$ and $TS, x_i \models \varphi\}$.
Then, for any path expression $\alpha \in PE_\Psi$, $TS, p \models \alpha$ if and only if $\sigma_p \models \alpha$.
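
The passage from a path to its decorated sequence is a simple labelling step. The sketch below illustrates it for a finite prefix of a path; the `holds` oracle that decides which formulas are true at a state is an assumption of this example (here it only looks up atomic propositions in a hypothetical valuation), as are all names in the code.

```python
def decorate(states, actions, psi, holds):
    """Build the decorated sequence A0 a1 A1 ... for x0 -a1-> x1 ...

    holds(x, phi) is an assumed oracle deciding whether phi is true at x.
    """
    seq = []
    for i, x in enumerate(states):
        seq.append(frozenset(phi for phi in psi if holds(x, phi)))
        if i < len(actions):
            seq.append(actions[i])
    return tuple(seq)


# Hypothetical valuation for two states of a gate-like system.
valuation = {"out": {"out_of_gate"}, "in": {"in_gate"}}


def holds(x, phi):
    """Atomic propositions only: phi is true at x if the valuation lists it."""
    return phi in valuation[x]


sigma_p = decorate(["out", "in"], ["enter"], {"in_gate", "out_of_gate"}, holds)
```

In the full algorithm the oracle would be supplied by the bottom-up labelling of states with the subformulas that hold there.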

There is a natural connection between the language $L(\alpha)$ defined by $\alpha \in PE_\Psi$
and the semantics of $\alpha$ in terms of $\Psi$-decorated sequences. To formalize this, we
need to define when a $\Psi$-decorated sequence embeds a sequence over $\Sigma \cup \Psi$.

Embedding Let $w = w_0 w_1 \ldots$ be an infinite word over $\Sigma \cup \Psi$ and $\sigma = \sigma_0 \sigma_1 \ldots$
be a $\Psi$-decorated sequence. Observe that $\sigma_i \subseteq \Psi$ if $i$ is even and $\sigma_i \in \Sigma$ if $i$
is odd. We say that $\sigma$ embeds $w$ if there is a monotone function $f : \mathbb{N}_0 \to \mathbb{N}_0$
mapping positions of $w$ into positions of $\sigma$ such that:

(i) If $i$ is in the range of $f$, $j < i$ and $j$ is odd, then $j$ is in the range of $f$.

(ii) If $w_i \in \Sigma$ then $w_i = \sigma_{f(i)}$.

(iii) If $w_i \in \Psi$ then $w_i \in \sigma_{f(i)}$.
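
To make the three conditions concrete, here is a hedged Python sketch that checks embedding for a finite word against a finite decorated sequence. This finite-prefix variant is an adaptation for illustration; the definition in the text is for infinite words. It relies on the observation that monotonicity together with condition (i) forces the actions of the word to match the actions of the decorated sequence one by one, so each formula letter must belong to the set at the current state position. The name `embeds_prefix` and the example data are hypothetical.

```python
def embeds_prefix(sigma, w, actions):
    """Can the finite word w be embedded into a prefix of sigma?

    sigma = (A0, a1, A1, ...) is a finite decorated sequence; letters of w
    that are in `actions` are actions, all other letters are formulas.
    """
    pos = 0  # index of the current formula set in sigma (always even)
    for letter in w:
        if letter in actions:
            # Condition (i) forbids skipping an action of sigma, so the next
            # action must be matched exactly, as condition (ii) requires.
            if pos + 1 >= len(sigma) or sigma[pos + 1] != letter:
                return False
            pos += 2
        else:
            # Condition (iii): a formula must belong to the current set.
            if letter not in sigma[pos]:
                return False
    return True


# Hypothetical decorated sequence over formulas {p, q} and actions {a, b}.
seq = (frozenset({"p", "q"}), "a", frozenset({"q"}), "b", frozenset())
acts = {"a", "b"}
```

Both `["p", "q", "a"]` and `["q", "p", "a"]` embed into `seq`, which shows that the order in which formulas about the same state are listed does not matter for embedding.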

We can then establish the following connection between $L(\alpha)$ and the set of
$\Psi$-decorated sequences that model $\alpha$.

Lemma 4.1. A $\Psi$-decorated sequence $\sigma$ models a path expression $\alpha \in PE_\Psi$ if
and only if $\sigma$ embeds at least one of the sequences in the language $L(\alpha)$ over the
alphabet $\Sigma \cup \Psi$.

Let $|\alpha|$ denote the number of symbols in a path expression $\alpha$. From the theory
of $\omega$-regular languages [Tho90], we know that for each path expression $\alpha \in PE_\Psi$,
we can construct a nondeterministic Büchi automaton $\mathcal{A}_\alpha$ over $\Sigma \cup \Psi$ whose size
is polynomial in $|\alpha|$ and whose language is precisely $L(\alpha)$.

Using our notion of embedding, we can associate with each Büchi automaton
$\mathcal{A}$ over $\Sigma \cup \Psi$ an extended language $L^+(\mathcal{A})$ of $\Psi$-decorated sequences as follows:
$L^+(\mathcal{A}) = \{\sigma \mid \exists w \in L(\mathcal{A}) : \sigma \text{ embeds } w\}$

Lemma 4.2. For any Büchi automaton $\mathcal{A}$ over $\Sigma \cup \Psi$, we can construct a Büchi
automaton $\mathcal{A}^+$ over $\Sigma \cup 2^\Psi$ such that $\mathcal{A}^+$ reads elements from $\Sigma$ and $2^\Psi$
alternately, $L^+(\mathcal{A}) = L(\mathcal{A}^+)$ and the number of states of $\mathcal{A}^+$ is polynomial in the
number of states of $\mathcal{A}$.

Proof Sketch: Let $\mathcal{A} = (Q, \delta, F, q_0)$ be a given nondeterministic Büchi automaton
over $\Sigma \cup \Psi$. For states $p, q \in Q$ and $U \subseteq \Psi$, we write $p \Rightarrow_U q$ if there is a
sequence of transitions from $p$ ending in $q$ whose labels are drawn from the set
$U$. We write $p \Rightarrow_U^F q$ to indicate that there is a sequence of transitions from $p$ to
$q$ that passes through a state from $F$, all of whose labels are drawn from the set
$U$. We write $p \Rightarrow_U^\omega$ if there is an accepting run starting at $p$, all of whose labels
are drawn from the set $U$.

The automaton $\mathcal{A}^+ = (Q^+, \Sigma \cup 2^\Psi, \delta^+, F^+, q_0)$ where:

$Q^+ = (Q \times \{0, 1, 2\}) \cup \{q_f, p_f\}$

$F^+ = (Q \times \{2\}) \cup \{p_f\}$

$\delta^+ = \{((p, 0), U, (q, 1)) \mid p \Rightarrow_U q\} \cup \{((p, 0), U, (q, 2)) \mid p \Rightarrow_U^F q\} \cup$
$\{((p, 0), U, q_f) \mid p \Rightarrow_U^\omega\} \cup$
$\{((p, 1), a, (q, 0)), ((p, 2), a, (q, 0)) \mid (p, a, q) \in \delta\} \cup$
$\{((p, 0), U, (p, 1)) \mid p \in Q, U \subseteq \Psi\} \cup \{(q_f, a, p_f) \mid a \in \Sigma\} \cup$
$\{(p_f, U, q_f) \mid U \subseteq \Psi\}$

□

The construction allows us to convert a path expression $\alpha \in PE_\Psi$ into an
automaton $\mathcal{A}_\alpha^+$ over $\Sigma \cup 2^\Psi$ whose state space is polynomial in $|\alpha|$. Notice that
the alphabet, and hence the number of transitions, of $\mathcal{A}_\alpha^+$ is exponential in $|\alpha|$.

The lemma assures us that if we complement $\mathcal{A}_\alpha^+$ we obtain an automaton
that accepts precisely those $\Psi$-decorated sequences that do not model $\alpha$. As
we remarked earlier, this is not true of the original automaton $\mathcal{A}_\alpha$: $\overline{L(\alpha)}$ may
contain sequences that can be embedded into $\Psi$-decorated sequences that model
$\alpha$. In a sense, $L^+(\mathcal{A})$ can be thought of as the semantic closure of $L(\mathcal{A})$.

Lemma 4.3 ([Saf88, EJ89]). A Büchi automaton $\mathcal{A}$ can be effectively determinized
(complemented) to get a deterministic Rabin automaton whose size is
exponential in the number of states of $\mathcal{A}$ and linear in the number of transitions
of $\mathcal{A}$ and whose set of accepting pairs is linear in the number of states of $\mathcal{A}$. This
automaton can be constructed in time exponential in the number of states of $\mathcal{A}$
and polynomial in the number of transitions of $\mathcal{A}$.

Lemma 4.4 follows from the previous two lemmas. 

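Lemma 4.4. From a path expression $\alpha \in PE_\Psi$, we can construct a deterministic
Rabin automaton $\hat{\mathcal{A}}_\alpha$ ($\hat{\mathcal{A}}_{\neg\alpha}$) that accepts exactly those $\Psi$-decorated paths that do
(do not) model $\alpha$. The number of states of $\hat{\mathcal{A}}_\alpha$ ($\hat{\mathcal{A}}_{\neg\alpha}$) is exponential in $|\alpha|$ and
the number of accepting pairs of $\hat{\mathcal{A}}_\alpha$ ($\hat{\mathcal{A}}_{\neg\alpha}$) is polynomial in $|\alpha|$.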

Theorem 4.5. Given a transition system $TS$, a state $x$ in $TS$ and an OReX
formula $\varphi$, the model checking problem $TS, x \models \varphi$ can be solved in time $O((mk)^{3k})$,
where $m$ is given by $2^{O(|\varphi|)} \cdot |TS|$ and $k$ is a polynomial in $|\varphi|$.

Proof Sketch: We use the bottom-up labelling algorithm of CTL* and ATL*.
The only interesting case is when the formula $\varphi$ to be checked is of the form
$E_S\,\alpha$ (or $A_S\,\alpha$). Each S-strategy $f$ at a state $x$ can be represented as a tree
by unfolding $TS$ starting at $x$, retaining all $\Sigma \setminus S$ edges in the unfolding, but
discarding all S-transitions that are not chosen by the strategy. Conversely, if
we unfold $TS$ at $x$ and prune S-transitions while ensuring that at least one
S-transition is retained at each state $y$ where $enabled(y) \cap \to_S \neq \emptyset$, we obtain
a representation of some S-strategy at $x$.

Let $\Psi$ be the set of state formulas that appear in $\alpha$. We may assume that
each state of the transition system $TS$ is labelled with the formulas from $\Psi$ that
hold at that state. In the trees that we use to represent strategies, each state is
labelled by the same subset of formulas as the corresponding state in $TS$.

The trees that represent strategies carry labels on both the states and the 
edges. We can convert these to state-labelled trees by moving the label on each 
edge to the destination state of the edge. 

Given a transition system $TS$, we can construct a Büchi tree automaton $\mathcal{T}_{TS}$
that accepts precisely those trees labeled by $\Sigma \cup 2^\Psi$ that arise from strategies,
as described in [KV96]. The size of $\mathcal{T}_{TS}$ is linear in the size of $TS$.

An S-strategy $f$ is a witness for $TS, x \models E_S\,\alpha$ if and only if every infinite
path in the tree corresponding to $f$ models $\alpha$. Equivalently, every infinite path
in this tree, when treated as a sequence over $\Sigma \cup 2^\Psi$, is accepted by the Rabin
automaton $\hat{\mathcal{A}}_\alpha$ constructed from $\alpha$, as described in Lemma 4.4. Since $\hat{\mathcal{A}}_\alpha$ is
deterministic, we can convert it into a Rabin tree automaton $\mathcal{T}_\alpha$ that runs $\hat{\mathcal{A}}_\alpha$
along all paths and accepts exactly those trees in which all infinite paths model
$\alpha$. The automaton $\mathcal{T}_\alpha$ has the same size as $\hat{\mathcal{A}}_\alpha$: that is, the number of states is
exponential in $|\alpha|$ and the number of accepting pairs is polynomial in $|\alpha|$.

Finally, we construct the product of $\mathcal{T}_{TS}$ and $\mathcal{T}_\alpha$ to obtain a tree automaton
that accepts a tree if and only if it corresponds to a strategy $f$ such that all
infinite paths in $TS$ consistent with $f$ satisfy $\alpha$. We can then use the algorithm
of Emerson and Jutla [EJ88] to check whether this product automaton accepts a
nonempty set of trees. The Emerson-Jutla algorithm solves the emptiness problem
for a tree automaton with $i$ states and $j$ accepting pairs in time $O((ij)^{3j})$.
From this, we can derive that the emptiness of the product tree automaton
we have constructed can be checked in time $O((mk)^{3k})$, where $m$ is given by
$2^{O(|\alpha|)} \cdot |TS|$ while $k$ is a polynomial in $|\alpha|$.

The case $A_S\,\alpha$ is similar to $E_S\,\alpha$ with one modification: use the construction
due to Emerson and Jutla [EJ89] instead of Safra’s construction to obtain a
deterministic Rabin automaton $\hat{\mathcal{A}}_{\neg\alpha}$ that accepts exactly those sequences over
$\Sigma \cup 2^\Psi$ that do not model $\alpha$.

In the constructions above, we have not taken into account the case when
we are interested only in (weakly or strongly) S-fair computations. Fairness
constraints can be handled by adding a simple Rabin condition to the automaton
$\mathcal{T}_{TS}$ that accepts all trees corresponding to S-strategies. This extra structure
does not materially affect the overall complexity of the procedure.

□ 



5 Related Work 

In [AHK97], the logic ATL* is interpreted over two kinds of transition systems:
synchronous structures and asynchronous structures. In each case, the notion of
agent plays a prominent role in the definition. In a synchronous structure, each
state is associated with a unique agent. An important subclass of synchronous
structures is the one with two agents, one representing the system and the other
the environment. Such systems have been studied by Ramadge and Wonham
[RW89] in the context of designing controllers for discrete-event systems.

In an asynchronous structure, each transition is associated with an agent.
A synchronous structure can be thought of as a special kind of asynchronous
structure where all transitions out of a state are associated with a single agent.
Our $\Sigma$-labelled transition systems correspond to asynchronous structures, where
the set of agents corresponds to the alphabet $\Sigma$.

The notion of strategy described in [AHK97] is slightly different from the one
we use here. In their framework, each agent has a strategy and an S-strategy
at a state allows any move permitted by all the individual strategies of the
agents in S. Our definition of global strategy is more generous. We believe that
decompositions of strategies where each agent has a view of the global state
do not arise naturally and have therefore chosen not to incorporate this into
our logic. Nevertheless, we can easily simulate the framework of [AHK97] in our
setup without changing the complexity of the construction.

In the Introduction, we mentioned that OReX subsumes ATL* in expressive
power since $\omega$-regular expressions are more powerful than LTL. We also note
that in the context of open systems, it is natural to refer to properties of both
states and actions in a transition system. If we use LTL to express properties of
paths, as is the case with ATL*, we have to encode actions in the states of the
system since LTL can only talk about properties of states. On the other hand,
OReX provides a seamless way of interleaving assertions about actions and states
along a path. This makes for more natural models of open systems: for instance,
compare our example of the train gate controller with explicit action labels with
the version in [AHK97] where the action labels are encoded as propositions.

There has been earlier work on game logics that mention actions. In [Par83], 
Parikh defines a propositional logic of games that extends propositional dynamic 
logics [Har84]. The logic studied in [Par83] is shown to be decidable. A complete
axiomatization is provided, but the model-checking problem is not studied.

6 Conclusions 

As Theorem 4.5 indicates, the model-checking complexity of OReX is at least
one exponential better than that of ATL*, though OReX subsumes ATL* in
expressive power. If we restrict the quantifiers of OReX to the normal branching-time
interpretation A and E, we obtain a system called BReX that corresponds
to the branching-time logic ECTL*. The model-checking algorithm for BReX is
exponential, which is comparable to the complexity of model-checking CTL*.

Why does going from CTL* to ATL* add an exponential to the complexity of 
model-checking, unlike the transition from BReX to OReX? The construction in 
[VW86] can be used to build nondeterministic Büchi automata for LTL formulas 
α and ¬α with exponential blow-up. Thus the complexity of model-checking 
CTL* stays exponential. When we translate BReX path formulas into automata 
we only incur a polynomial blowup in size, but we may have to complement the 
automaton for α to get an automaton for ¬α. Complementation is, in general, 
exponential and, consequently, model-checking for BReX is also exponential. 

When going from CTL* to ATL*, we need to determinize the Büchi automaton 
constructed from the LTL formula α so that we can interpret the automaton 
over trees. This blows up the automaton by another exponential, resulting in 
a model-checking algorithm that is doubly exponential in the size of the input 
formula. In OReX the automaton we construct for path formulas is already 
determinized, so we avoid the second exponential blow-up. 

Finally, we note that if we restrict our syntax to only permit the existential 
path quantifier E_α, we can model-check BReX in polynomial time: essentially, 
it suffices to construct a nondeterministic automaton for α. On the other hand, 
the model-checking problem for OReX remains exponential even with this 
restriction because we have to determinize the automaton for α. Since ω-regular 
languages are closed under complementation, we can reduce any formula in BReX 
or OReX to one with just existential path quantifiers. However, in the process, we 
have to complement ω-regular expressions. Since this is, in general, an 
exponential operation, the cost of this translation is exponential (perhaps nonelementary, 
since nested negations introduce nested exponentials). Thus, the full language 
that we have presented here permits more succinct specifications and more 
efficient verification than the reduced language with just existential quantifiers. 
Abstract. Inheritance is a characteristic reasoning mechanism in systems 
with taxonomic information. In rule-based deductive systems with 
inclusion polymorphism, inheritance can be captured in a natural way by 
means of typed substitution. However, with method overriding and mul- 
tiple inheritance, it is well-known that inheritance is nonmonotonic and 
the semantics of inheritance becomes problematical. We present a gen- 
eral framework, based on Dung’s abstract theory of argumentation, for 
developing a natural semantics for declarative programs with dynamic 
defeasible inheritance. We investigate the relationship between the pre- 
sented semantics and Dobbie and Topor’s perfect model (with overriding) 
semantics, and show that for inheritance-stratified programs, the two se- 
mantics coincide. The proposed semantics, nevertheless, still provides the 
correct skeptical meanings for non-inheritance-stratified programs, while 
the perfect model semantics fails to yield sensible meanings for them. 



1 Introduction 

One of the most salient features associated with generalization taxonomy is 
inheritance. In logic-based deduction systems which support inclusion polymor- 
phism (or subtyping), inheritance can be captured in an intuitive way by means 
of typed substitutions. To illustrate this, suppose that tom is an individual of 
type student. Then, given a program clause: 

C1: X: student[residence → east-dorm] ← X[lives-in → rangsit-campus], 
        X[sex → male], 

which is intended to state that for any student X, if X lives in rangsit-campus and 
X is male, then X’s residence place is east-dorm; one can obtain by the application 
of the typed substitution {X: student/tom} to C1 the ground clause: 

G1: tom[residence → east-dorm] ← tom[lives-in → rangsit-campus], 
        tom[sex → male]. 

The clause C1 can naturally be considered as a conditional definition of the 
method residence associated with the type (class)¹ student and the clause G1 as 
a definition of the same method for tom inherited from the type student. 
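
The step from C1 to G1 can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: atoms are encoded as (object, method, value) triples, a clause carries a single typed variable, and type membership is looked up in a hypothetical table `INSTANCE_OF` (subtyping is ignored). None of these names come from the paper.

```python
# Hypothetical membership table: tom is both a student and an employee.
INSTANCE_OF = {"tom": {"student", "employee"}}

# C1 encoded as (typed variable, head atom, body atoms).
C1 = (("X", "student"),
      ("X", "residence", "east-dorm"),
      (("X", "lives-in", "rangsit-campus"), ("X", "sex", "male")))


def apply_typed_substitution(clause, obj, instance_of):
    """Replace the typed variable by obj, provided obj has the required type."""
    (var, var_type), head, body = clause
    if var_type not in instance_of.get(obj, set()):
        return None  # the typed substitution is not applicable

    def subst(atom):
        return tuple(obj if part == var else part for part in atom)

    return subst(head), tuple(subst(b) for b in body)


G1 = apply_typed_substitution(C1, "tom", INSTANCE_OF)
NOT_APPLICABLE = apply_typed_substitution(C1, "ait", INSTANCE_OF)
```

Here `G1` is the ground clause for tom, while the substitution for an object that is not known to be a student is rejected.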

However, when a method is supposed to return a unique value for an object, 
definitions of a method inherited from different types tend to conflict. For 
example, suppose that tom is also an individual of type employee and a clause: 

C2: X: employee[residence → west-flats] ← X[lives-in → rangsit-campus], 
        X[marital-status → married], 

defining the method residence for an employee is also given. Then, the definition 
of residence for tom obtained from C2, i.e., 

G2: tom[residence → west-flats] ← tom[lives-in → rangsit-campus], 
        tom[marital-status → married], 

conflicts with the previously inherited definition G1 when they are both 
applicable. In the presence of such conflicting definitions, the usual semantics of 
definite programs, e.g., the minimal model semantics, does not provide 
satisfactory meanings for programs; for example, if a program has both G1 and G2 
above as its ground instances, then, whenever its minimal model entails each 
atom in the antecedents of G1 and G2, it will entail the conflicting information 
that tom’s residence place is east-dorm and is west-flats. 

In order to provide appropriate meanings for programs with such conflicting 
inherited definitions, a different semantics that allows some ground clauses whose 
antecedents are satisfied to be inactive is needed. This paper applies Dung’s the- 
ory of argumentation [6] to the development of such a semantics. To resolve 
inheritance conflicts, the proposed approach requires a binary relation on program 
ground clauses, called the domination relation, which determines among 
possibly conflicting definitions whether one is intended to defeat another. For 
example, with additional information that students who are also employees usually 
prefer the accommodation provided for employees, G2 is supposed to defeat 
G1. With such a domination relation, a program will be transformed into an 
argumentation framework, which captures the logical interaction between the 
intended deduction and domination; and, then, the meaning of the program will 
be defined based on the grounded extension of this argumentation framework. 

Using this approach, conflict resolution is performed dynamically with re- 
spect to the applicability of method definitions. That is, the domination of one 
method definition over another is effective only if the antecedent of the domi- 
nating definition succeeds. The appropriateness of dynamic method resolution in 
the context of deductive rule-based systems, where method definitions are often 
conditional and may be inapplicable to certain objects, is advocated by [1]. In 
particular, with the possibility of overriding, when the definitions in the most 
specific type are inapplicable, it is reasonable to try to apply those in a more 
general type. 

¹ In this paper, the terms “type” and “class” are used interchangeably. 

In order to argue for the correctness and the generality of the proposed seman- 
tics in the presence of method overriding, its relationship to the perfect model 
(with overriding) semantics proposed by Dobbie and Topor [5] is investigated. 
The investigation reveals that these two semantics coincide for inheritance- 
stratified programs. Moreover, while the perfect model semantics fails to pro- 
vide sensible meanings for programs which are not inheritance-stratified, the 
presented semantics still yields their correct skeptical meanings. 

For the sake of simplicity and generality, this paper uses Akama’s axiomatic 
theory of logic programs [4], called DP theory (the theory of declarative pro- 
grams), as its primary logical basis. The rest of this paper is organized as follows. 
Section 2 recalls some basic definitions and results from Dung’s argumentation- 
theoretic foundation and DP theory. Section 3 describes the proposed seman- 
tics. Section 4 establishes the relationship between the proposed semantics and 
the perfect model (with overriding) semantics. Section 5 discusses other related 
works and summarizes the paper. 

2 Preliminaries 

2.1 Argumentation Framework 

Based on the basic idea that a statement is believable if some argument 
supporting it can be defended successfully against attacking arguments, Dung has 
developed an abstract theory of argumentation [6] and demonstrated that many 
approaches to nonmonotonic reasoning in AI are special forms of argumentation. 
In this subsection, the basic concepts and results from this theory are recalled. 

Definition 1. An argumentation framework is a pair ⟨AR, attacks⟩, where AR 
is a set and attacks is a binary relation on AR. □ 

In the sequel, let AF = ⟨AR, attacks⟩ be an argumentation framework. The 
elements of AR are called arguments. An argument a ∈ AR is said to attack 
an argument b ∈ AR, iff (a, b) ∈ attacks. Let B ⊆ AR. B is said to attack an 
argument b ∈ AR, iff some argument in B attacks b. An argument a ∈ AR is 
said to be acceptable with respect to B, iff, for each b ∈ AR, if b attacks a, then 
B attacks b. B is said to be conflict-free, iff there do not exist arguments a, b ∈ B 
such that a attacks b. B is said to be admissible, iff B is conflict-free and every 
argument in B is acceptable with respect to B. 

The credulous semantics and the stable semantics of AF are defined by the 
notions of preferred extension and stable extension, respectively: 

Definition 2. A preferred extension of AF is a maximal (with respect to set 
inclusion) admissible subset of AR. A set A ⊆ AR is called a stable extension of 
AF, iff A is conflict-free and A attacks every argument in AR − A. □ 
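
For a small finite framework, the notions above can be checked by brute force. The following Python sketch is an illustration only: it enumerates all subsets of AR, which is exponential, and the function names are chosen here rather than taken from [6]. The framework used is the one of Example 1.

```python
from itertools import combinations

AR = {"a", "b", "c", "d", "e"}
ATTACKS = {("a", "b"), ("b", "c"), ("d", "e"), ("e", "d")}


def set_attacks(B, b, attacks):
    """B attacks b iff some argument in B attacks b."""
    return any((a, b) in attacks for a in B)


def conflict_free(B, attacks):
    return not any((a, b) in attacks for a in B for b in B)


def acceptable(a, B, ar, attacks):
    """a is acceptable w.r.t. B iff B attacks every attacker of a."""
    return all(set_attacks(B, b, attacks) for b in ar if (b, a) in attacks)


def admissible(B, ar, attacks):
    return conflict_free(B, attacks) and all(
        acceptable(a, B, ar, attacks) for a in B)


def subsets(ar):
    items = sorted(ar)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)


def preferred_extensions(ar, attacks):
    adm = [B for B in subsets(ar) if admissible(B, ar, attacks)]
    # keep only the admissible sets that are maximal w.r.t. set inclusion
    return {B for B in adm if not any(B < C for C in adm)}


def stable_extensions(ar, attacks):
    return {B for B in subsets(ar)
            if conflict_free(B, attacks)
            and all(set_attacks(B, b, attacks) for b in ar - B)}


preferred = preferred_extensions(AR, ATTACKS)
stable = stable_extensions(AR, ATTACKS)
```

Both `preferred` and `stable` contain exactly the two sets {a, c, d} and {a, c, e}.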
To define the grounded (skeptical) semantics of AF (Definition 3), the function 
F_AF on 2^AR, called the characteristic function of AF, is defined by: 

F_AF(X) = {a | a is acceptable with respect to X}. 

Clearly, F_AF is monotonic (with respect to ⊆), and, thus, has the least fixpoint. 

Definition 3. The grounded extension of AF is the least fixpoint of F_AF. □ 

The next example illustrates the three kinds of extensions. 

Example 1. Let AF = ⟨AR, attacks⟩, where AR = {a, b, c, d, e} and attacks = 
{(a, b), (b, c), (d, e), (e, d)}. Then, AF has two preferred extensions, i.e., {a, c, d} 
and {a, c, e}, which are also stable extensions. As F_AF(∅) = {a} and F_AF^2(∅) = 
{a, c} = F_AF^3(∅), the grounded extension of AF is {a, c}. □ 
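
The fixpoint computation in Example 1 can be reproduced directly. The sketch below assumes a finite AR, so that iterating the characteristic function from the empty set reaches the least fixpoint after finitely many steps; the function names are illustrative.

```python
def characteristic(X, ar, attacks):
    """F_AF(X): the arguments that are acceptable with respect to X."""
    def defended(a):
        attackers = [b for b in ar if (b, a) in attacks]
        return all(any((x, b) in attacks for x in X) for b in attackers)
    return {a for a in ar if defended(a)}


def grounded_extension(ar, attacks):
    """Iterate F_AF from the empty set until it stabilises (finite AR)."""
    current = set()
    while True:
        nxt = characteristic(current, ar, attacks)
        if nxt == current:
            return current
        current = nxt


AR = {"a", "b", "c", "d", "e"}
ATTACKS = {("a", "b"), ("b", "c"), ("d", "e"), ("e", "d")}

step1 = characteristic(set(), AR, ATTACKS)   # F_AF applied once
step2 = characteristic(step1, AR, ATTACKS)   # F_AF applied twice
grounded = grounded_extension(AR, ATTACKS)
```

Neither d nor e is ever accepted, because each is attacked by the other and nothing else defends it.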

Well-foundedness of an argumentation framework, recalled next, is a sufficient 
condition for the coincidence between the three kinds of extensions. 

Definition 4. AF is well-founded, iff there exists no infinite sequence of 
arguments a_0, a_1, ..., a_n, ... such that for each i ≥ 0, a_{i+1} attacks a_i. □ 



Theorem 1. If AF is well-founded, then it has exactly one preferred extension 
and one stable extension, each of which is equal to its grounded extension. □ 
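
For a finite framework, an infinite sequence as in Definition 4 exists exactly when the attack relation contains a cycle, so well-foundedness reduces to cycle detection. The following Python sketch relies on that finite-case observation; it is not a general decision procedure for infinite frameworks.

```python
def is_well_founded(ar, attacks):
    """Finite case only: well-founded iff the attack graph has no cycle."""
    attackers = {a: [b for b in ar if (b, a) in attacks] for a in ar}
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {a: WHITE for a in ar}

    def visit(a):
        colour[a] = GREY
        for b in attackers[a]:
            if colour[b] == GREY:
                return False  # back edge: an attack cycle was found
            if colour[b] == WHITE and not visit(b):
                return False
        colour[a] = BLACK
        return True

    return all(colour[a] != WHITE or visit(a) for a in ar)


# The framework of Example 1 contains the cycle d, e, d, ...
example1_wf = is_well_founded(
    {"a", "b", "c", "d", "e"},
    {("a", "b"), ("b", "c"), ("d", "e"), ("e", "d")})

# A simple attack chain has no cycle.
chain_wf = is_well_founded({"a", "b", "c"}, {("a", "b"), ("b", "c")})
```

This matches Example 1, where the framework is not well-founded and the grounded extension differs from the preferred and stable extensions.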



2.2 DP Theory 

DP theory [4] is an axiomatic theory which purports to generalize the concept 
of conventional logic programs to cover a wider variety of data domains. As an 
introduction to DP theory, the notion of a specialization system is reviewed first. 
It is followed by the concepts of declarative programs and their minimal model 
semantics on a specialization system. 

Definition 5. A specialization system is a 4-tuple ⟨𝒜, 𝒢, 𝒮, μ⟩ of three sets 𝒜, 𝒢 
and 𝒮, and a mapping μ from 𝒮 to partial_map(𝒜) (i.e., the set of all partial 
mappings on 𝒜), that satisfies the conditions: 

1. (∀s, s′ ∈ 𝒮)(∃s″ ∈ 𝒮) : μs″ = (μs′) ∘ (μs), 

2. (∃s ∈ 𝒮)(∀a ∈ 𝒜) : (μs)a = a, 

3. 𝒢 ⊆ 𝒜. □ 

In the rest of this subsection, let Γ = ⟨𝒜, 𝒢, 𝒮, μ⟩ be a specialization system. The 
elements of 𝒜 are called atoms; the set 𝒢 is called the interpretation domain; 
the elements of 𝒮 are called specialization parameters or simply specializations; 
and the mapping μ is called the specialization operator. A specialization s ∈ 𝒮 
is said to be applicable to a ∈ 𝒜, iff a ∈ dom(μs). By formulating a suitable 
specialization operator together with a suitable set of specialization parameters, the 
typed-substitution operation can be regarded as a special form of specialization 
operation. 

Let X be a subset of 𝒜. A definite clause C on X is a formula of the form 
(a ← b_1, ..., b_n), where n ≥ 0 and a, b_1, ..., b_n are atoms in X. The atom a 
is denoted by head(C) and the set {b_1, ..., b_n} by Body(C). When n = 0, C is 
called a unit clause. A definite clause C′ is an instance of C, iff there exists s ∈ 𝒮 
such that s is applicable to a, b_1, ..., b_n and C′ = ((μs)a ← (μs)b_1, ..., (μs)b_n). 
A definite clause on 𝒢 is called a ground clause. A declarative program on Γ is a 
set of definite clauses on 𝒜. Given a declarative program P on Γ, let Gclause(P) 
denote the set of all ground instances of clauses in P. Conventional (definite) 
logic programs as well as typed logic programs can be viewed as declarative 
programs on some specialization systems. 

An interpretation is defined as a subset of 𝒢. Let I be an interpretation. 
If C is a definite clause on 𝒢, then I is said to satisfy C iff (head(C) ∈ I) or 
(Body(C) ⊈ I). If C is a definite clause on 𝒜, then I is said to satisfy C iff 
for every ground instance C′ of C, I satisfies C′. I is a model of a declarative 
program P on Γ, iff I satisfies every definite clause in P. The meaning of P is 
defined as the minimum model of P, which is the intersection of all models of P. 
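
For a finite set of ground clauses, the minimum model can be computed by the usual bottom-up iteration instead of intersecting all models. The sketch below assumes that ground clauses are encoded as (head, body) pairs of strings and tuples; the small program `P` is only an illustration.

```python
def satisfies(interpretation, clause):
    """I satisfies C iff head(C) is in I or Body(C) is not a subset of I."""
    head, body = clause
    return head in interpretation or not set(body) <= interpretation


def minimum_model(ground_clauses):
    """Least set of atoms closed under the clauses (finite ground program)."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, body in ground_clauses:
            if head not in model and set(body) <= model:
                model.add(head)
                changed = True
    return model


# a <-   b <-   c <- a   d <- c, b   f <- e
P = [("a", ()), ("b", ()), ("c", ("a",)), ("d", ("c", "b")), ("f", ("e",))]
M = minimum_model(P)
```

The atom f is not in `M`, since its premise e is never derived, yet the clause f ← e is still satisfied by `M`.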



3 The Proposed Semantics 

In the sequel, let Γ = ⟨𝒜, 𝒢, 𝒮, μ⟩ be a specialization system and P a declarative 
program on Γ. Let dominates be a binary relation on Gclause(P). A ground 
clause C of P is said to dominate another ground clause C′ of P, iff (C, C′) ∈ 
dominates. It will be assumed henceforth that the relation dominates prioritizes 
the ground clauses of P; more precisely, for any ground clauses C, C′ of P, C 
dominates C′, iff C is preferable to C′ and whenever Body(C) is satisfied, C′ 
will be inactive. It should be emphasized that the domination of a ground clause 
C over another ground clause C′ is intended to be dynamically operative with 
respect to the applicability of C, i.e., the domination is effective only if the 
condition part of C is satisfied. The relation dominates will also be referred to 
as the domination relation of P. 



3.1 Derivation Trees 

The notion of a derivation tree of a program will be introduced first. A derivation 
tree of P represents a derivation of one conclusion from P. It will be considered 
as an argument that supports its derived conclusion. Every conclusion in the 
minimum model of P is supported by at least one derivation tree of P. 

Definition 6. A derivation tree of P is defined inductively as follows: 

1. If C is a unit clause in Gclause(P), then the tree of which the root is C and 
the height is 0 is a derivation tree of P. 

2. If C = (a ← b_1, ..., b_n) is a clause in Gclause(P) such that n > 0 and 
T_1, ..., T_n are derivation trees of P with roots C_1, ..., C_n, respectively, such 
that head(C_i) = b_i, for each i ∈ {1, ..., n}, then the tree of which the root 
is C and the immediate subtrees are exactly T_1, ..., T_n is a derivation tree 
of P. 

3. Nothing else is a derivation tree of P. □ 

Fig. 1. The derivation trees of the program P_1. 

Example 2. Let P_1 be a declarative program comprising the five ground clauses: 

a ←     b ←     c ← a     d ← c, b     f ← e 

Then, P_1 has exactly four derivation trees, which are shown in Figure 1. Note 
that the derivation trees T_1, T_2, T_3 and T_4 in the figure depict the derivation of 
the conclusions a, b, c and d, respectively. □ 
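
Definition 6 can be turned into a bottom-up enumeration. The Python sketch below is a minimal illustration that assumes a finite ground program with finitely many derivation trees (a round limit guards against programs where that fails). A tree is encoded as a pair (root clause, tuple of immediate subtrees); this encoding is an assumption made here.

```python
from itertools import product


def derivation_trees(ground_clauses, max_rounds=20):
    """Enumerate derivation trees; a tree is (clause, tuple_of_subtrees)."""
    trees = set()
    for _ in range(max_rounds):
        new = set()
        for clause in ground_clauses:
            _head, body = clause
            # For each body atom, the trees whose root has that atom as head.
            choices = [[t for t in trees if t[0][0] == b] for b in body]
            for subtrees in product(*choices):
                new.add((clause, subtrees))
        if new <= trees:
            break
        trees |= new
    return trees


def conclusion(tree):
    """head(root(T)) for a derivation tree T."""
    return tree[0][0]


# The program P_1 of Example 2.
P1 = [("a", ()), ("b", ()), ("c", ("a",)), ("d", ("c", "b")), ("f", ("e",))]
TREES = derivation_trees(P1)
CONCLUSIONS = {conclusion(t) for t in TREES}
```

A unit clause yields a tree of height 0 because the product over an empty list of choices contains exactly one empty tuple. The clause f ← e contributes no tree, since no tree concludes e.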

In the sequel, the root of a derivation tree T will be denoted by root(T). A 
derivation tree T will be regarded as an argument that supports the activation of 
the ground clause root(T) (and, thus, supports the conclusion head(root(T))). 

3.2 Grounded-Extension-Based Semantics 

In order to define the meaning of P with respect to the domination relation, the 
program P will be transformed into an argumentation framework AF(P), which 
provides an appropriate structure for understanding the dynamic interaction of 
the deduction process of P and the specified domination relation. Intuitively, one 
argument (derivation tree) attacks another argument (derivation tree), when the 
ground clause supported by the former dominates some ground clause used in 
the construction of the latter. 

Definition 7. The argumentation framework AF(P) = ⟨AR, attacks⟩ is 
defined as follows: AR is the set of all derivation trees of P, and for any T, T′ ∈ AR, 
T attacks T′, iff root(T) dominates some node of T′. □ 

Example 3. Referring to the program P_1 of Example 2, suppose that the 
ground clause a ← dominates the ground clause b ←, and for any other two 
ground clauses in P_1, one does not dominate the other. Then AF(P_1) = 
⟨AR_{P_1}, attacks⟩, where AR_{P_1} consists of the four derivation trees in Figure 1 
and attacks = {(T_1, T_2), (T_1, T_4)}. (Note that T_1 attacks T_4 as the root of T_1 
dominates the right leaf of T_4.) □ 

Fig. 2. The argumentation framework for the program P_2. 



The meaning of P is now defined as the set of all conclusions which are 
supported by some arguments in the grounded extension of AF(P). 

Definition 8. The grounded-extension-based meaning of P is defined as the 
set {head(root(T)) | T ∈ GE}, where GE is the grounded extension of AF(P). □ 

Four examples illustrating the proposed semantics are given below. 

Example 4. Consider AF(P_1) of Example 3. Let F be the characteristic function 
of AF(P_1). Clearly, F(∅) = {T_1, T_3} = F(F(∅)). Thus F(∅) is the grounded 
extension of AF(P_1), and, then, the grounded-extension-based meaning of P_1 is {a, c}. □ 



Example 5. Let a declarative program P_2 comprise the six ground clauses: 

a ←     b ←     c ←     d ← a     e ← b     f ← c 

Let d ← a dominate b ←, and e ← b dominate f ← c, and assume that 
for any other two ground clauses in P_2, one does not dominate the other. 
Then AF(P_2) = ⟨AR_{P_2}, attacks⟩, where AR_{P_2} consists of the six derivation 
trees shown in Figure 2 and attacks = {(T_8, T_6), (T_8, T_9), (T_9, T_10)}, as 
depicted by the darker arrows between the derivation trees in the figure. Let 
F be the characteristic function of AF(P_2). Then F(∅) = {T_5, T_7, T_8}, and 
F^2(∅) = {T_5, T_7, T_8, T_10} = F^3(∅). So the grounded-extension-based meaning 
of P_2 is {a, c, d, f}. This example also 
illustrates dynamic conflict resolution, i.e., the domination of the ground clause 
e ← b over the ground clause f ← c does not always prevent the activation of 
the latter. □ 
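
The whole construction (derivation trees, the attacks relation of Definition 7, and the grounded extension) can be sketched end to end for a finite ground program. The code below is an illustrative sketch, not the authors' implementation: the clause encoding, the helper names, and the assumption that the program has finitely many derivation trees are all made here.

```python
from itertools import product


def derivation_trees(clauses, max_rounds=20):
    """Bottom-up enumeration; a tree is (clause, tuple_of_subtrees)."""
    trees = set()
    for _ in range(max_rounds):
        new = {(c, subs)
               for c in clauses
               for subs in product(
                   *[[t for t in trees if t[0][0] == b] for b in c[1]])}
        if new <= trees:
            break
        trees |= new
    return trees


def nodes(tree):
    """All ground clauses occurring as nodes of a derivation tree."""
    clause, subtrees = tree
    result = {clause}
    for sub in subtrees:
        result |= nodes(sub)
    return result


def grounded_meaning(clauses, dominates):
    trees = derivation_trees(clauses)
    # T attacks T' iff root(T) dominates some node of T'.
    attacks = {(t, u) for t in trees for u in trees
               if any((t[0], n) in dominates for n in nodes(u))}
    extension = set()
    while True:  # least fixpoint of the characteristic function
        nxt = {a for a in trees
               if all(any((x, b) in attacks for x in extension)
                      for b in trees if (b, a) in attacks)}
        if nxt == extension:
            break
        extension = nxt
    return {t[0][0] for t in extension}


A, B, C = ("a", ()), ("b", ()), ("c", ())
D, E, F = ("d", ("a",)), ("e", ("b",)), ("f", ("c",))
DOMINATES = {(D, B), (E, F)}

meaning = grounded_meaning([A, B, C, D, E, F], DOMINATES)
meaning_without_a = grounded_meaning([B, C, D, E, F], DOMINATES)
```

With all six clauses, `meaning` agrees with Example 5. Dropping the clause a ← makes d ← a inapplicable, so b ← and e ← b become active and e ← b now blocks f ← c; this is the dynamic behaviour described in the example.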



Example 6. Refer to the clauses C1, C2, G1 and G2 given at the beginning of 
Section 1. Let tom belong both to type student and to type employee. Consider 
a program P_3 comprising C1, C2 and the following three clauses: 

C3: tom[lives-in → rangsit-campus] ← 

C4: tom[sex → male] ← 

C5: tom[marital-status → married] ← 

Assume, for simplicity, that C1 and C2 have G1 and G2, respectively, as their 
only ground instances. Suppose that students who are also employees prefer the 
accommodation provided for employees, and, then, that G2 dominates G1. Then, 
it is simple to see that the grounded-extension-based meaning of P_3 contains 
tom[residence → west-flats] but does not 
contain tom[residence → east-dorm], and yields the desired meaning of P_3. 

To demonstrate dynamic conflict resolution, suppose next that the clause C5 
is removed from P_3. Then, instead of containing tom[residence → west-flats], 
the meaning contains tom[residence → east-dorm]; and, it still provides the correct meaning 
of P_3 in this case. □ 



Example 7. This example illustrates method overriding. Let ait be an instance of 
type int(ernational)-school and int-school be a subtype of school. Let a program 
P_4 comprise the following three clauses: 

X: school[medium-of-teaching → thai] ← X[located-in → thailand] 

X: int-school[medium-of-teaching → english] ← 

ait[located-in → thailand] ← 

For the sake of simplicity, assume that P_4 has only three ground clauses: 

G3: ait[medium-of-teaching → thai] ← ait[located-in → thailand] 

G4: ait[medium-of-teaching → english] ← 

G5: ait[located-in → thailand] ← 

Since int-school is more specific than school, G4 is supposed to override G3; 
therefore, let G4 dominate G3. It is readily seen that the grounded-extension-based 
meaning of P_4 is the set consisting 
of the two atoms ait[located-in → thailand] and ait[medium-of-teaching → english], 
which is the expected meaning of P_4. □ 

4 Perfect Model (with Overriding) Semantics 

Dobbie and Topor defined a deductive object-oriented language called Gulog 
[5], in which inheritance is realized through typed substitutions, and studied 
the interaction of deduction, inheritance and overriding in the context of this 
language. The declarative semantics for Gulog programs is based on Przymusinski’s 
perfect model semantics for logic programs [11], but using the possibility of 
overriding instead of negation in defining a priority relationship between ground 
atoms. The perfect model (with overriding) semantics provides the correct mean- 
ings for inheritance-stratified programs. In order to investigate the relationship 
between this semantics and the grounded-extension-based semantics, the notions 
of inheritance stratification and perfect model will be reformulated in the frame- 
work of DP theory in Subsection 4.1. The relationship between the two kinds of 
semantics will then be discussed in Subsection 4.2. 

4.1 Inheritance-Stratified Programs and Perfect Models 

According to [5], a program is inheritance-stratified if there is no cycle in any 
definition of a method, i.e., a definition of a method does not depend on an 
inherited definition of the same method. More precisely: 
Definition 9. A declarative program P on Γ is said to be inheritance-stratified, 
iff it is possible to decompose the interpretation domain 𝒢 into disjoint sets, 
called strata, 𝒢_0, 𝒢_1, ..., 𝒢_γ, ..., where γ < δ and δ is a countable ordinal, such 
that the following conditions are all satisfied. 

1. For each C ∈ Gclause(P), if head(C) ∈ 𝒢_α, then 

   (a) for each b ∈ Body(C), b ∈ ∪_{β≤α} 𝒢_β, 

   (b) for each C′ ∈ Gclause(P) such that C′ dominates C, 

       i. head(C′) ∈ ∪_{β≤α} 𝒢_β, 

       ii. for each b′ ∈ Body(C′), b′ ∈ ∪_{β<α} 𝒢_β. 

2. There exists no infinite sequence C_0, C_1, ..., C_n, ... of clauses in Gclause(P) 
such that for each i ≥ 0, C_{i+1} dominates C_i. 

Any decomposition {𝒢_0, 𝒢_1, ..., 𝒢_γ, ...} of 𝒢 satisfying the above conditions is 
called an inheritance stratification of P. □ 



An example of non-inheritance-stratified programs will be given in Subsection 
4.2 (Example 8). The next theorem illuminates the coincidence between 
the grounded extension, preferred extension and stable extension of the 
argumentation framework for an inheritance-stratified program (see Theorem 1 in 
Subsection 2.1). Its proof can be found in the full version of this paper [10]. 

Theorem 2. If P is inheritance-stratified, then AF(P) is well-founded. □ 

With overriding, not every ground clause of a program is expected to be 
satisfied by a reasonable model of that program. More precisely, a ground clause 
need not be satisfied if it is overridden by some ground clause whose premise is 
satisfied. This leads to the following notion of a model with overriding: 

Definition 10. An interpretation I is a model with overriding (for short, 
o-model) of P, iff for each C ∈ Gclause(P), either I satisfies C or there exists 
C′ ∈ Gclause(P) such that C′ dominates C and Body(C′) ⊆ I. □ 

A program may have more than one o-model. Following [5], priority relations 
between ground atoms are defined based on the possibility of overriding. 



Definition 11. Priority relations ≤_P and <_P on 𝒢 are defined as follows: 

1. If C ∈ Gclause(P), then 

   (a) for each b ∈ Body(C), head(C) ≤_P b, 

   (b) for each C′ ∈ Gclause(P), if C′ dominates C, then 

       i. head(C) ≤_P head(C′), 

       ii. for each b′ ∈ Body(C′), head(C) <_P b′, 

2. If a ≤_P b and b ≤_P c, then a ≤_P c, 

3. If a ≤_P b and b <_P c (respectively, d <_P a), then a <_P c (respectively, d <_P b), 

4. If a <_P b, then a ≤_P b, 

5. Nothing else satisfies ≤_P or <_P. □ 

A preference relationship among o-models will then be defined based on the 
priority relation <_P. 

Definition 12. Let M and N be o-models of P. M is said to be preferable to 
N, in symbols, M ≪ N, iff M ≠ N and for each a ∈ M − N, there exists 
b ∈ N − M such that a <_P b. M is said to be a perfect o-model of P, iff there 
exists no o-model of P preferable to M. □ 

Every inheritance-stratified program P has exactly one perfect o-model,² 
which provides the correct meaning of P with respect to method overriding. 



4.2 Relationship between the Proposed Semantics and Perfect 
Model (with Overriding) Semantics 

It is shown in the full version of this paper [10] that: 

Theorem 3. If P is inheritance-stratified and the domination relation is 
transitive, then the grounded-extension-based meaning of P coincides with the 
perfect o-model of P. □ 

It is important to note that since the domination due to method overriding is 
typically transitive, the transitivity requirement does not weaken Theorem 3. 

For programs that are not inheritance-stratified, the perfect model semantics 
fails to provide their sensible meanings, while the proposed semantics still yields 
their correct skeptical meanings. (The skeptical approach to method resolution 
discards all conflicting definitions.) This is demonstrated by the next example. 



Example 8. Let tom be an instance of type gr(aduate)-student and let gr-student 
be a subtype of student. Consider the declarative program P_5 comprising the 
following five clauses: 



C6: X: student[math-ability → good] ← X[math-grade → b] 

C7: X: student[major → math] ← X[math-ability → good], 
        X[favourite-subject → math] 

C8: X: gr-student[math-ability → average] ← X[major → math], 
        X[math-grade → b] 

C9: tom[math-grade → b] ← 

C10: tom[favourite-subject → math] ← 




Without loss of generality, suppose for simplicity that C6, C7 and C8 have as 
their ground instances only the clauses G6, G7 and G8, given below, respectively: 

G6: tom[math-ability → good] ← tom[math-grade → b] 

G7: tom[major → math] ← tom[math-ability → good], 
        tom[favourite-subject → math] 

G8: tom[math-ability → average] ← tom[major → math], 
        tom[math-grade → b] 



² This result is analogous to and inspired by the corresponding result for 
inheritance-stratified Gulog programs [5]. Its proof is given completely in [9]. 
The ground clauses G6 and G8 are considered as definitions of the method 
math-ability inherited from the types student and gr-student, respectively. As 
gr-student is more specific than student, G8 is supposed to dominate G6. Then, every 
inheritance stratification of P_5 requires that the ground atom tom[major → math] 
must be in a stratum which is lower than the stratum containing it, which is a 
contradiction. Hence P_5 is not inheritance-stratified. 

Observe that G8 dominates G6, but G8 also depends on G6; that is, the 
activation of G6 results in the activation of G8, which is supposed to override G6. 
Therefore, it is not reasonable to use any of them. As a consequence, none of the 
conclusions of G6, G7 and G8 should be derived. However, it can be shown that 
each o-model of P_5 contains both tom[major → math] and tom[math-ability → 
average]. So no o-model of P_5 serves as its reasonable meaning. 

Now consider the proposed semantics. It is simple to see that the 
grounded-extension-based meaning of P_5 is the 
set {tom[math-grade → b], tom[favourite-subject → math]}, which is the correct 
skeptical meaning of P_5 (i.e., the meaning obtained in the usual way after 
discarding the conflicting clauses G6 and G8). □ 
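
The claim about the o-models of P_5 can be checked by brute force over all interpretations built from the five atoms that occur in the program. The sketch below follows Definition 10 directly; restricting the interpretation domain to these five atoms, and the string encoding of atoms, are assumptions made for the illustration.

```python
from itertools import combinations

GRADE = "tom[math-grade -> b]"
FAV = "tom[favourite-subject -> math]"
GOOD = "tom[math-ability -> good]"
MAJOR = "tom[major -> math]"
AVERAGE = "tom[math-ability -> average]"

# Ground clauses of P_5 as (head, body) pairs.
C9, C10 = (GRADE, ()), (FAV, ())
G6 = (GOOD, (GRADE,))
G7 = (MAJOR, (GOOD, FAV))
G8 = (AVERAGE, (MAJOR, GRADE))
P5 = [C9, C10, G6, G7, G8]
DOMINATES = {(G8, G6)}


def satisfies(i, clause):
    head, body = clause
    return head in i or not set(body) <= i


def is_o_model(i, clauses, dominates):
    """Each clause is satisfied, or overridden by a dominating clause
    whose body holds in the interpretation."""
    return all(
        satisfies(i, c)
        or any((d, c) in dominates and set(d[1]) <= i for d in clauses)
        for c in clauses)


atoms = sorted({c[0] for c in P5})
interpretations = [frozenset(s) for k in range(len(atoms) + 1)
                   for s in combinations(atoms, k)]
o_models = [i for i in interpretations if is_o_model(i, P5, DOMINATES)]
```

Under this encoding there are exactly two o-models, and both contain tom[major → math] and tom[math-ability → average], in line with the observation in Example 8.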



5 Related Works and Conclusions 

Defeasible inheritance has been intensively studied in the context of inheritance 
networks [7,12,13]. Although the process of drawing conclusions from a set of 
defeasible hypotheses in inheritance networks is quite different from the process 
of deduction (as pointed out in [7]) and these works do not discuss dynamic 
method resolution, they do provide the presented approach with a foundation 
for determining the domination relation among ground clauses. A type hierarchy 
and a membership relation can be represented as a network, and the domination 
relation can then be determined based on the topological structure of the net- 
work. For example, if there exists a path from an object o through a type t to 
a type t' in the network, then it is natural to suppose that the ground method 
definitions for o inherited from t dominate those inherited from t' . 
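
The path-based rule just described can be sketched as a reachability test. The tables `SUBTYPE_OF` and `INSTANCE_OF` below are hypothetical encodings of a type hierarchy and a membership relation, populated from the examples in this paper; the function names are likewise illustrative.

```python
# Hypothetical hierarchy: each type maps to its direct supertypes.
SUBTYPE_OF = {"int-school": {"school"}, "gr-student": {"student"}}
# Hypothetical membership: each object maps to its direct types.
INSTANCE_OF = {"ait": {"int-school"}, "tom": {"student", "employee"}}


def supertypes(t, subtype_of):
    """All proper supertypes of t reachable in the hierarchy."""
    seen, stack = set(), list(subtype_of.get(t, ()))
    while stack:
        u = stack.pop()
        if u not in seen:
            seen.add(u)
            stack.extend(subtype_of.get(u, ()))
    return seen


def types_of(obj, instance_of, subtype_of):
    """Direct types of obj together with all their supertypes."""
    direct = instance_of.get(obj, set())
    return set(direct).union(*(supertypes(t, subtype_of) for t in direct))


def inherited_definition_dominates(obj, t, t_prime, instance_of, subtype_of):
    """True iff there is a path from obj through t to t_prime."""
    return (t in types_of(obj, instance_of, subtype_of)
            and t_prime in supertypes(t, subtype_of))


ait_case = inherited_definition_dominates(
    "ait", "int-school", "school", INSTANCE_OF, SUBTYPE_OF)
tom_case = inherited_definition_dominates(
    "tom", "employee", "student", INSTANCE_OF, SUBTYPE_OF)
```

For ait, the definition inherited from int-school dominates the one inherited from school. For tom, student and employee are incomparable, so the topology alone yields no domination and extra information (as in Example 6) is needed.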

Besides [5], distinguished proposals that incorporate inheritance in the 
context of logic-based deduction systems include [1, 2, 3, 8]. However, in [1] and [8], 
inheritance is realized by other means than typed substitution; i.e., [1] 
captures inheritance by transforming subclass relationships into rules of the form 
class(X) ← subclass(X), and [8] models inheritance as implicit implication on 
interpretation domains (called H-structures). [2] and [3] incorporate inheritance 
into unification algorithms but do not discuss nonmonotonic inheritance. 

This paper studies the interaction of inheritance, realized by means of typed 
substitution, and deduction, and proposes a framework for discussing a declara- 
tive semantics for definite declarative programs with nonmonotonic inheritance. 
The framework uses a domination relation on program ground clauses, specifying 
their priority, as additional information for resolving conflicting method defini- 
tions. With a specified domination relation, a program is transformed into an 
argumentation framework which provides an appropriate structure for analyzing 
the interrelation between the intended deduction and domination. The meaning 
of the program is defined based on the grounded extension of this 
argumentation framework. Method resolution in the framework is dynamic with respect 
to the applicability of methods. The paper not only shows that the proposed 
semantics and Dobbie and Topor’s perfect model (with overriding) semantics [5] 
coincide for inheritance-stratified programs (Theorem 3), but also claims that 
the proposed semantics provides correct skeptical meanings for 
non-inheritance-stratified programs. 
Abstract. Entailment of subtype constraints was introduced for constraint 
simplification in subtype inference systems. Designing an efficient algorithm for 
subtype entailment turned out to be surprisingly difficult. The situation was clarified 
by Rehof and Henglein who proved entailment of structural subtype constraints 
to be coNP-complete for simple types and PSPACE-complete for recursive types. 
For entailment of non-structural subtype constraints of both simple and recursive 
types they proved PSPACE-hardness and conjectured PSPACE-completeness but 
failed in finding a complete algorithm. In this paper, we investigate the source of 
complications and isolate a natural subproblem of non-structural subtype 
entailment that we prove PSPACE-complete. We conjecture (but this is left open) that 
the presented approach can be extended to the general case. 

Keywords: subtyping, constraints, entailment, automata. 



1 Introduction 

Subtyping is a natural concept in programming. This observation has motivated the 
design of programming languages featuring a system for subtype inference [8, 11, 2, 
6, 18]. Simplification of typings turned out to be the key issue in what concerns the 
complexity of subtype inference systems [7, 19, 17]. Several authors proposed to sim- 
plify typings based on algorithms for subtype entailment, i.e. entailment of subtype 
constraints [22, 16]. First approaches towards subtype entailment seem to presuppose 
[22, 16] that the problem could be solved efficiently. But finding an efficient algorithm 
for subtype entailment turned out to be surprisingly difficult [9, 20, 10, 18]. And in fact,
it still remains open whether subtype entailment is decidable, even if restricted to an
inexpressive type language. The most prominent open problem is the decidability of
entailment between non-structural subtype constraints. 

Types. A simple type is a finite tree built from a signature $\Sigma$ of function symbols
(i.e. a ground term over $\Sigma$). A recursive type is an infinite tree over $\Sigma$. Most typically,
$\Sigma$ contains the constants int and real and the binary function symbol $\times$ for pairing.
The type of a pair of integers, for instance, is the finite tree int $\times$ int. The signature $\Sigma$
typically also provides constants $\bot$ and $\top$ for the least type and the greatest type.

Many further types are of interest for programming: contravariant function types
$\tau \to \tau'$, record types $\{f_1{:}\tau_1, \ldots, f_n{:}\tau_n\}$, intersection and union types, and
polymorphic types. These types fall out of the scope of the present paper. In order to keep
subtype entailment simple, we restrict ourselves to types that are finite or infinite trees
built from a signature $\Sigma \subseteq \{int, real, \times, \bot, \top\}$.
Fig. 1. Structural versus non-structural subtyping



Subtyping. When considering types as trees, subtyping becomes a partial order on
trees. A typical subtype relationship is int $\le$ real which states that every integer can be
used as a real (its relevance is discussed in depth in [12]). The former subtype relationship
induces int $\times$ int $\le$ real $\times$ real which means that every pair of integers can be used
as a pair of reals. Both relationships are structural in that they relate trees of the same
shape. Subtyping becomes non-structural in the presence of a least type $\bot$ or greatest
type $\top$, since $\bot \le \tau$ and $\tau \le \top$ hold for all types $\tau$. The difference between structural and
non-structural subtyping is illustrated in Figure 1.

Subtype Entailment. A subtype constraint is a logical description of types whose
interpretation is based on the subtype relation. We assume a set of type variables ranged
over by x, y, z. A subtype constraint $\psi$ is a conjunction of ordering constraints $t \le t'$
between terms $t, t'$ built from variables and function symbols in $\Sigma$. Subtype entailment
is the problem to decide whether an implication $\psi \to x \le y$ is valid in the structure
of trees, i.e. whether $\psi \models x \le y$ holds. Four cases are to be distinguished: either we
interpret over finite trees (simple types) or else over possibly infinite trees (recursive
types); either we consider non-structural subtyping where $\bot, \top \in \Sigma$ or else structural
subtyping where $\bot, \top \notin \Sigma$. The differences between these cases can be illustrated by the
following example.

$x \le y \times y \wedge x \times x \le y \models x \le y$

First, we consider structural subtyping with the signature $\Sigma = \{int, real, \times\}$. For finite
trees, the left hand side is unsatisfiable and thus entailment holds. For infinite trees,
there exists a unique solution where both x and y are mapped to the complete binary
tree $\mu z. z \times z$; thus entailment holds again. Second, we consider non-structural subtyping
with the signature $\Sigma = \{\top, \bot, int, real, \times\}$. There are many more solutions than for
structural subtyping. For instance, the variable assignment mapping x to $\bot \times (\top \times \bot)$ and
y to $\top \times (\bot \times \top)$ is a solution of $x \le y \times y \wedge x \times x \le y$ which contradicts entailment of $x \le y$
for both finite and infinite trees.
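The counterexample above can be checked mechanically. The following is a minimal sketch, assuming finite trees over the signature $\{\bot, \top, \times\}$ only; the names `BOT`, `TOP`, `pair`, `violations`, and `leq` are illustrative and do not appear in the paper. It compares labels at all common paths, which is how non-structural subtyping is defined formally in Section 2.

```python
# Finite types over {bot, top, x}: constants are strings,
# a pair type l x r is the tuple ("x", l, r).
BOT, TOP = "bot", "top"


def pair(left, right):
    return ("x", left, right)


def violations(s, t, path=""):
    """Return the common paths at which s <= t fails (empty list: s <= t holds)."""
    if s == BOT or t == TOP:
        return []                      # bot <= anything, anything <= top
    if isinstance(s, tuple) and isinstance(t, tuple):
        # both roots are labeled x: compare the children in parallel
        return (violations(s[1], t[1], path + "1")
                + violations(s[2], t[2], path + "2"))
    return [path]                      # remaining label pairs are unordered


def leq(s, t):
    return not violations(s, t)


x = pair(BOT, pair(TOP, BOT))          # bot x (top x bot)
y = pair(TOP, pair(BOT, TOP))          # top x (bot x top)

print(leq(x, pair(y, y)))              # x <= y x y        -> True
print(leq(pair(x, x), y))              # x x x <= y        -> True
print(violations(x, y))                # x <= y fails at   -> ['21']
```

Both conjuncts of the left hand side hold under this assignment, while $x \le y$ fails at the path 21, where x carries $\top$ and y carries $\bot$.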

Open Problem. Early algorithms for subtype entailment were incomplete [22, 16,
18]. The situation was clarified by Henglein and Rehof who determined the complexity
of structural subtype entailment: for simple types, it is coNP-complete [9] and for
recursive types it is PSPACE-complete [10]. However, the complexity of non-structural
subtype entailment remains open; it is at least PSPACE-hard, both for finite and infinite
trees [20, 10]. It is even unknown whether non-structural subtype entailment is
decidable. Nevertheless, Rehof conjectures PSPACE-completeness (see Conjecture 9.4.5 of
[20]).

Contribution. In this paper, we investigate the source of complications underlying
non-structural subtype entailment. To this purpose, we introduce an extension of
finite automata that we call P-automata and illustrate the relevance of P-automata to
non-structural subtype entailment at a sequence of examples. P-automata can
recognize nonregular and even non-context-free languages, as we show. This fact yields new
insights into the expressiveness of non-structural subtype entailment.

Based on the insight gained by P-automata, we isolate a fragment of non-structural
subtype constraints for which we prove decidability of entailment. We consider the
signature $\{\bot, \top, \times\}$ and both cases, finite and possibly infinite trees respectively. The only
restriction we require is that $\bot$ and $\top$ are not supported syntactically, i.e. that constraints
such as $x \times \top \le z$ cannot be written.

The algorithm we present is based on a polynomial time reduction to the universality 
problem of finite automata (which is PSPACE-complete). The idea is that more general 
P-automata are not needed for entailment of the restricted language. Our algorithm 
solves an entailment problem in PSPACE that was proved PSPACE-hard by Rehof and 
Henglein [10]. Its correctness proof is technically involved; it shows why nonregular 
sets of words - as recognized by P-automata - can be safely ignored. 

Related Entailment Problems. Several entailment problems for constraint languages
describing trees are considered in the literature. Two of them were shown PSPACE- 
complete in [14, 15]. The common property of these PSPACE-complete entailment 
problems is that entailment depends on properties of regular sets of words in the con- 
straint graph. In contrast, nonregular sets have to be taken into account for non-structural 
subtype entailment. 

In feature logics, several languages for describing feature trees (i.e. record types)
have been investigated for entailment. Entailment for equality constraints over feature 
trees can be decided in quasi linear time [1,21]. Ordering constraints over feature trees 
[5, 4] can be considered as record subtype constraints. Entailment of ordering con- 
straints over feature trees can be solved in cubic time [13]. However, entailment with 
existential quantification is PSPACE-complete again [14]. 

Entailment has also been considered for set constraints (i.e. constraints for union 
and intersection types). Entailment of set constraints with intersection is proved 
DEXPTIME-complete in [3] for an infinite signature. Entailment of atomic set con- 
straints [15] is proved PSPACE-complete in case of an infinite signature and 
DEXPTIME-hard for a finite signature. 



2 Non-structural Subtype Constraints 

We assume a signature $\Sigma$ which provides function symbols denoted by $f$ each of which
has a fixed arity $ar(f) \ge 0$. We require that $\Sigma$ contains the constants $\bot$ and $\top$, i.e.
$ar(\bot) = ar(\top) = 0$. We also assume an infinite set of variables ranged over by
x, y, z, u, v, w.

Paths and Trees. A path is a word of natural numbers $n \ge 1$ that we denote by $\pi$,
$o$, or $\sigma$. The empty path is denoted by $\varepsilon$ and the free-monoid concatenation of paths
$\pi$ and $\pi'$ by juxtaposition $\pi\pi'$, with the property that $\varepsilon\pi = \pi\varepsilon = \pi$. A prefix of a path
$\pi$ is a path $\pi'$ for which there exists a path $\pi''$ such that $\pi = \pi'\pi''$. A proper prefix of
$\pi$ is a prefix of $\pi$ but not $\pi$ itself. If $\pi'$ is a prefix of $\pi$ then we write $\pi' \le \pi$ and if $\pi'$
is a proper prefix of $\pi$ then we write $\pi' < \pi$. The prefix closure of a set of paths $\Pi$ is
denoted as $pr(\Pi)$, i.e. $pr(\Pi) = \{\pi \mid \text{exists } \pi' \in \Pi : \pi \le \pi'\}$, and its proper prefix
closure with $pr_{<}(\Pi)$, i.e. $pr_{<}(\Pi) = \{\pi \mid \text{exists } \pi' \in \Pi : \pi < \pi'\}$.

A tree $\tau$ is a pair $(D, L)$ where $D$ is a tree domain, i.e. a non-empty prefix-closed
set of paths, and $L : D \to \Sigma$ a (total) function determining the labels of $\tau$. We denote
the tree domain of a tree $\tau$ by $D_\tau$ and its labeling function with $L_\tau$. We require that trees
$\tau$ are arity consistent: for all paths $\pi \in D_\tau$ and natural numbers $i$: $1 \le i \le ar(L_\tau(\pi))$
iff $\pi i \in D_\tau$. A tree is finite if its tree domain is finite and infinite otherwise. We denote
the set of all finite trees with $Tree^{fin}_\Sigma$ and the set of all trees with $Tree_\Sigma$.

Non-Structural Subtyping. Let $\le_L$ be the least (reflexive) partial order on function
symbols of $\Sigma$ which satisfies for all $f \in \Sigma$:

$\bot \le_L f \le_L \top$

We define non-structural subtyping as a partial order $\le$ on trees such that $\tau_1 \le \tau_2$ holds
for trees $\tau_1, \tau_2$ iff for all paths $\pi \in D_{\tau_1} \cap D_{\tau_2}$ it holds that $L_{\tau_1}(\pi) \le_L L_{\tau_2}(\pi)$.

Let $NS_\Sigma$ be the structure with signature $\Sigma \cup \{\le\}$ whose domain is the set $Tree_\Sigma$.
Function symbols in $\Sigma$ are interpreted as tree constructors and the relation symbol $\le$
as non-structural subtyping (which we also denote by $\le$). The structure $NS^{fin}_\Sigma$ is the
restriction of $NS_\Sigma$ to the domain of finite trees $Tree^{fin}_\Sigma$.

A term $t$ is either a variable or a construction $f(t_1, \ldots, t_n)$ where $t_1, \ldots, t_n$ are
terms, $f \in \Sigma$, and $n = ar(f)$. Of course, $\bot$ and $\top$ are terms since they are constants in
$\Sigma$. A non-structural subtype constraint over $\Sigma$ is a conjunction of ordering constraints
$t_1 \le t_2$. We consider two cases for their interpretation, either the structure $NS_\Sigma$ or the
structure $NS^{fin}_\Sigma$. We mostly use flattened constraints $\psi$ of the following form:

$\psi ::= x = f(x_1, \ldots, x_n) \mid x \le y \mid \psi \wedge \psi'$   ($f \in \Sigma$, $ar(f) = n$)

The omission of nested terms does not restrict the expressiveness of entailment. Terms
on the left hand side of an entailment judgment can be flattened by introducing new
variables for all subterms. Furthermore, $\psi \models t_1 \le t_2$ is equivalent to
$\psi \wedge x \le t_1 \wedge t_2 \le y \models x \le y$ where x, y are fresh variables (by transitivity of $\le$ in one
direction, and by choosing x and y to denote $t_1$ and $t_2$ in the other).

Satisfiability and Entailment. Let $\Phi$ denote a first-order formula built from ordering
constraints with the usual first-order connectives and let $V(\Phi)$ be the set of free
variables in $\Phi$. We write $\Phi'$ in $\Phi$ if there exists $\Phi''$ such that $\Phi = \Phi' \wedge \Phi''$ up to
associativity and commutativity of conjunction. Suppose that $\mathcal{A}$ is a structure with signature
$\Sigma \cup \{\le\}$. A solution of $\Phi$ in $\mathcal{A}$ is a variable assignment $\alpha$ into the domain of $\mathcal{A}$ such that
$\Phi$ evaluates to true under $\mathcal{A}$ and $\alpha$. We call $\Phi$ satisfiable in $\mathcal{A}$ if there exists a solution
for $\Phi$ in $\mathcal{A}$. A formula $\Phi$ is valid in $\mathcal{A}$ if all variable assignments into the domain of $\mathcal{A}$
are solutions of $\Phi$. A formula $\Phi$ entails $\Phi'$ in $\mathcal{A}$, written $\Phi \models_{\mathcal{A}} \Phi'$, if $\Phi \to \Phi'$ is valid in $\mathcal{A}$.

S0  if $x \in V(\psi)$ then $x \le x$ in $\psi$
S1  if $x \le y$ in $\psi$ and $y \le z$ in $\psi$ then $x \le z$ in $\psi$
S2  if $x = f(x_1, \ldots, x_n) \wedge x \le y \wedge y = f(y_1, \ldots, y_n)$ in $\psi$ then $x_i \le y_i$ in $\psi$
S3  not $x = f_1(x_1, \ldots, x_n)$ in $\psi$, $x \le y$ in $\psi$, $y = f_2(y_1, \ldots, y_m)$ in $\psi$, and $f_1 \not\le_L f_2$
S4  not $x_i = f(\ldots, y_{i+1}, \ldots) \wedge y_{i+1} = x_{i+1}$ in $\psi$ where $n \ge 1$ and $x_1 = x_{n+1}$

Table 1. Closure and Clash-freeness Properties: S0-S3 for $NS_\Sigma$ and S0-S4 for $NS^{fin}_\Sigma$

Restricted Language. Let $\Sigma_2$ be the signature $\{\bot, \top, \times\}$ where $\times$ is a binary function
symbol. A restricted subtype constraint $\varphi$ has the form:

$\varphi ::= u = u_1 \times u_2 \mid u_1 \le u_2 \mid \varphi_1 \wedge \varphi_2$

The following restrictions are crucial for entailment as we will discuss later on: 1) The
constraints $x = \bot$ and $x = \top$ are excluded. 2) The signature $\Sigma_2$ does not contain a unary
function symbol. Nevertheless, the restricted entailment problem is not trivial. It is not
difficult to see, and was proved by Rehof and Henglein [10], that entailment of the restricted
language can express universality of non-deterministic finite automata; thus:

Proposition 1 (Hardness). Non-structural subtype entailment for the restricted
constraint language is PSPACE-hard for both structures $NS^{(fin)}_{\Sigma_2}$.

We next recall a closure algorithm from [10] which decides the satisfiability of
(unrestricted) non-structural subtype constraints $\psi$ over an arbitrary signature $\Sigma$. In
Table 1, a set of properties S0-S4 is given. The properties for $NS^{fin}_\Sigma$ and $NS_\Sigma$ differ only
in an additional occurs check for the case of finite trees (S4). Reflexivity and transitivity
of subtyping are required by (S0) and (S1). The decomposition property of subtyping is
stated in (S2), and clash-freeness for labeling in (S3).

We call a (flattened) constraint $\psi$ closed if it satisfies S0-S2. Properties S0-S2 can
also be considered as a saturation algorithm which computes the closure of a (flattened)
constraint $\psi$ in cubic time. A constraint $\psi$ is clash-free for $NS_\Sigma$ if it satisfies S3 and for
$NS^{fin}_\Sigma$ if it satisfies S3-S4.



Proposition 2 (Satisfiability). A constraint is satisfiable in $NS_\Sigma$ (resp. $NS^{fin}_\Sigma$) if its
closure is clash-free for $NS_\Sigma$ (resp. $NS^{fin}_\Sigma$).
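The saturation under S0-S2 and the clash test S3 can be sketched as follows. This is an illustrative sketch, not the algorithm of [10] as published: the tuple encoding of constraints and the names `label_leq`, `closure`, and `clash_free` are assumptions made here, the loop is a naive fixpoint iteration rather than a cubic-time implementation, and the occurs check S4 is omitted, so the test corresponds to the structure of possibly infinite trees only.

```python
from itertools import product

BOT, TOP = "bot", "top"


def label_leq(f, g):
    """The order <=_L on function symbols: bot <=_L f <=_L top and f <=_L f."""
    return f == BOT or g == TOP or f == g


def closure(eqs, les):
    """Saturate a flattened constraint under S0-S2.

    eqs: set of (x, f, args) encoding x = f(args), args a tuple of variables
    les: set of (x, y) encoding x <= y
    """
    variables = {x for x, _, _ in eqs} | {a for _, _, args in eqs for a in args}
    variables |= {v for le in les for v in le}
    les = set(les) | {(v, v) for v in variables}                  # S0
    while True:
        new = {(x, z) for (x, y), (y2, z) in product(les, les) if y == y2}  # S1
        for (x, f, xs), (y, g, ys) in product(eqs, eqs):          # S2
            if f == g and (x, y) in les:
                new |= set(zip(xs, ys))
        if new <= les:
            return les
        les |= new


def clash_free(eqs, closed_les):
    """Check S3 only (no occurs check S4)."""
    return all(label_leq(f, g)
               for (x, f, _), (y, g, _) in product(eqs, eqs)
               if (x, y) in closed_les)


# x = u x v, y = p x q, x <= y, u = top, p = bot: decomposition yields u <= p,
# which clashes because top is not below bot.
eqs = {("x", "pair", ("u", "v")), ("y", "pair", ("p", "q")),
       ("u", TOP, ()), ("p", BOT, ())}
les = {("x", "y")}
closed = closure(eqs, les)
print(("u", "p") in closed, clash_free(eqs, closed))
```

The example shows why the clash test has to run on the closure: the original constraint contains no direct ordering between u and p.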

3 P-Automata 

We now present the notion of a P-automaton on which we will base our analysis of
subtype entailment. A P-automaton is an extension of a finite automaton with a new
kind of edges. Let $\mathcal{A} = (A, Q, I, F, \Delta)$ be a finite automaton with alphabet $A$, states $Q$,
initial states $I$, final states $F$, and transition edges $\Delta$. We write $\mathcal{A} \vdash p \xrightarrow{\pi} q$ for states
$p, q \in Q$ and $\pi \in A^*$ if the automaton $\mathcal{A}$ permits a transition from $p$ to $q$ while reading
$\pi$. Thus, $\mathcal{A}$ recognizes the language $\mathcal{L}(\mathcal{A}) = \{\pi \in A^* \mid \mathcal{A} \vdash p \xrightarrow{\pi} q,\ p \in I,\ q \in F\}$.

Definition 3. A P-automaton $\mathcal{P}$ is a pair $(\mathcal{A}, P)$ consisting of a finite automaton
$\mathcal{A} = (A, Q, I, F, \Delta)$ and a set of P-edges $P \subseteq Q \times Q$ between the states of $\mathcal{A}$. The
P-automaton $\mathcal{P}$ recognizes the language $\mathcal{L}(\mathcal{P}) \subseteq A^*$ given by:

$\mathcal{L}(\mathcal{P}) = \mathcal{L}(\mathcal{A}) \cup \bigcup \{\pi(\sigma\varrho)^*\sigma \mid \mathcal{A} \vdash p \xrightarrow{\pi} q \xrightarrow{\sigma} r \xrightarrow{\varrho} s,\ (s, q) \in P,\ p \in I,\ r \in F\}$

A P-automaton recognizes all words in the language of the underlying finite automaton. In
addition, it is permitted to use P-edges as $\varepsilon$-edges,
except that the first usage of a P-edge determines the period of the remaining word to
be read (the period is $\sigma\varrho$ in Definition 3). We draw P-edges as dashed lines.



Example 4. Consider the P-automaton with alphabet {1, 2}, states
{q, s}, initial and final state q, edges $q \xrightarrow{1} s$ and $q \xrightarrow{2} s$ and a single
P-edge (s, q). The automaton can loop by using its P-edge multiple times,
but the first usage determines a period (the word 1 or 2) of the remaining
word. Thus, the language recognized is $1^* \cup 2^*$ rather than $(1 \cup 2)^*$.

The length of a period fixed by a first usage of a P-edge need not be bounded by
the number of states of the P-automaton. This fact raises the following problem.



Lemma 5 (Failure of context-freeness). There exists a P-automaton whose language 
is not context-free (and thus not regular). 



Proof. We consider the P-automaton with alphabet {1, 2}, states
{q, r, s}, initial states {q}, final states {r}, transition edges $q \xrightarrow{\varepsilon} r$,
$r \xrightarrow{1} r$ and $r \xrightarrow{2} s$, and a single P-edge (s, q). This P-automaton is
depicted to the right. It recognizes the language $\bigcup \{pr(\pi^*) \mid \pi \in 1^*2\}$,
which is not context-free. Otherwise, the intersection with the regular
language $1^*21^*21^*2$ would also be context-free. But this intersection is the
language $\{1^n 2 1^n 2 1^n 2 \mid n \ge 0\}$ which is clearly not context-free.
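Membership in the language of Definition 3 can be tested by brute force over all decompositions of a word. The sketch below is only meant to make the definition concrete; the encoding of edges as triples with `""` for $\varepsilon$ and the names `eps_closure`, `run`, and `accepts` are assumptions of this illustration. It is applied to the automaton of Example 4 and to the automaton from the proof of Lemma 5.

```python
def eps_closure(states, edges):
    """All states reachable from `states` via epsilon edges (label "")."""
    todo, seen = list(states), set(states)
    while todo:
        p = todo.pop()
        for (src, a, dst) in edges:
            if src == p and a == "" and dst not in seen:
                seen.add(dst)
                todo.append(dst)
    return seen


def run(states, word, edges):
    """States reachable from `states` while reading `word`."""
    current = eps_closure(states, edges)
    for letter in word:
        moved = {dst for (src, a, dst) in edges if src in current and a == letter}
        current = eps_closure(moved, edges)
    return current


def accepts(word, initial, final, edges, p_edges):
    """Is word in L(A), or of the form pi (sigma rho)^k sigma as in Definition 3?"""
    if run(initial, word, edges) & final:
        return True
    for i in range(len(word) + 1):
        rest = word[i:]
        for q in run(initial, word[:i], edges):
            for a in range(len(rest) + 1):          # a = |sigma|
                for b in range(len(rest) + 1):      # b = |rho|
                    period_len, body = a + b, len(rest) - a
                    # k = body / period_len must be a positive integer
                    # (k = 0 gives pi sigma, which is already in L(A))
                    if period_len == 0 or body < period_len or body % period_len:
                        continue
                    period = rest[:period_len]
                    sigma, rho = period[:a], period[a:]
                    if period * (body // period_len) + sigma != rest:
                        continue
                    for r in run({q}, sigma, edges) & final:
                        if any((s, q) in p_edges for s in run({r}, rho, edges)):
                            return True
    return False


# Example 4: language 1* U 2*
ex4 = dict(initial={"q"}, final={"q"},
           edges={("q", "1", "s"), ("q", "2", "s")}, p_edges={("s", "q")})
# Proof of Lemma 5: union of pr(pi*) for pi in 1*2
lem5 = dict(initial={"q"}, final={"r"},
            edges={("q", "", "r"), ("r", "1", "r"), ("r", "2", "s")},
            p_edges={("s", "q")})

print(accepts("111", **ex4), accepts("12", **ex4))
print(accepts("112112112", **lem5), accepts("1121212", **lem5))
```

The second automaton accepts $1^n 2 1^n 2 1^n 2$ for every n but rejects words in which the blocks of 1s have different lengths, which is the behaviour exploited in the proof.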




4 Path Constraints 

We now introduce path constraints which express properties of subtrees at a given path. 
Path constraints are fundamental in understanding entailment for many languages of 
ordering constraints [9, 10, 20, 14, 13, 15]. 

If $\tau$ is a tree and $\pi \in D_\tau$ then we write $\tau.\pi$ for the subtree of $\tau$ at $\pi$, i.e.
$D_{\tau.\pi} = \{\pi' \mid \pi\pi' \in D_\tau\}$ and $L_{\tau.\pi}(\pi') = L_\tau(\pi\pi')$ for all $\pi' \in D_{\tau.\pi}$. A subtree constraint
$x.\pi = y$ requires that the domain of the denotation of x contains $\pi$ and that its subtree at
path $\pi$ is the denotation of y.

Conditional path constraints of the first kind we use are of the form
$x?\pi_1 \le_L y?\pi_2$. The question mark indicates conditionality depending on the existence of a
subtree. A path constraint $x?\pi_1 \le_L y?\pi_2$ is solved by a variable assignment $\alpha$ if
$\pi_1 \in D_{\alpha(x)}$ and $\pi_2 \in D_{\alpha(y)}$ implies $L_{\alpha(x)}(\pi_1) \le_L L_{\alpha(y)}(\pi_2)$. We freely omit
the conditionality $?\varepsilon$ since the path $\varepsilon$ does always exist. We also write $x?\sigma \le_L f$
instead of $\exists y (x?\sigma \le_L y \wedge y \le f(\top, \ldots, \top))$, and, symmetrically, $f \le_L x?\sigma$ instead of
$\exists y (f(\bot, \ldots, \bot) \le y \wedge y \le_L x?\sigma)$.

Proposition 6 (Characterization of Entailment). For all u, v the following
equivalence is valid in $NS^{(fin)}_\Sigma$:

$u \le v \leftrightarrow \bigwedge \{u?\pi \le_L v?\pi \mid \pi \text{ a path}\}$

We call a path $\pi$ a contradiction path for $\psi \models x \le y$ if and only if $\psi$ does not entail
$x?\pi \le_L y?\pi$. In this terminology, Proposition 6 states that an entailment judgment
$\psi \models x \le y$ holds if and only if there exists no contradiction path for it.

We need further conditional path constraints of the form $x?\pi \le^{?} y$, $x \le^{?} y?o$, and
$x?\pi \le^{?} y?o$ which do not only restrict x and y at the paths $\pi$ and $o$ but also at their
prefixes. The semantics of these constraints is defined in Table 2.

$x?\pi \le^{?} y$      $\leftrightarrow$   $x.\pi \le y \ \vee\ \bigvee_{\pi' < \pi} x.\pi' \le \bot$
$x \le^{?} y?o$        $\leftrightarrow$   $x \le y.o \ \vee\ \bigvee_{o' < o} \top \le y.o'$
$x?\pi \le^{?} y?o$    $\leftrightarrow$   $\exists u (x?\pi \le^{?} u \wedge u \le^{?} y?o)$

Table 2. Semantics of conditional path constraints

Note that the path constraint $x?\pi \le^{?} y$ is entailed by $\exists z (x.\pi = z \wedge z \le y)$ but not vice
versa. The reason is that $x?\pi \le^{?} y$ constrains x even if $x.\pi$ is not defined. For instance, the
constraint $x \le f(y)$ entails $x?1 \le^{?} y$ which - if $x.1$ is not defined - requires $x \le \bot$.

For a restricted signature, the semantics of conditional path constraints is much less
ad hoc than it might seem at first sight. This is shown by Lemma 8 for the signature
$\Sigma_n = \{\bot, \top, g\}$ where $g$ is a function symbol with $ar(g) = n$.

Lemma 7. For $n \ge 1$, signature $\Sigma_n = \{\bot, \top, g\}$, paths $\pi \in \{1, \ldots, n\}^*$ and $\pi' < \pi$:
$u?\pi \le^{?} v \to u?\pi' \le_L g$ and $u \le^{?} v?\pi \to g \le_L v?\pi'$ are valid in $NS^{(fin)}_{\Sigma_n}$.

Lemma 8 (Subtree versus conditional path constraints). For $n \ge 1$, paths $\pi \in
\{1, \ldots, n\}^*$ and variables x, y the following equivalences hold in the structure $NS^{(fin)}_{\Sigma_n}$:

$x?\pi \le^{?} y \leftrightarrow \exists z (x \le z \wedge z.\pi = y)$,   $x \le^{?} y?\pi \leftrightarrow \exists z (z \le y \wedge z.\pi = x)$

$u.\pi = v \leftrightarrow u?\pi \le^{?} v \wedge v \le^{?} u?\pi$

Proof. We only prove the implication from the right to the left in the third equivalence.
Assume that $u?\pi \le^{?} v \wedge v \le^{?} u?\pi$. For arbitrary $\pi' < \pi$, Lemma 7 proves the validity
of $u?\pi' \le_L g$ and $g \le_L u?\pi'$. Thus, $u.\pi$ must be defined, and hence, $u.\pi = v$.

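Lemma 9 (Strange loops raising P-edges). For all variables u, v, all paths $\sigma < \pi$, and
$k \ge 0$ the following implication is valid in $NS^{(fin)}_{\Sigma_n}$:

$u?\pi \le^{?} v \wedge u \le^{?} v?\pi \to u?\pi^k\sigma \le_L v?\pi^k\sigma$

Proof. By induction on k. Let $k = 0$. Since $\sigma < \pi$, Lemma 7 proves that $u?\pi \le^{?} v \wedge
u \le^{?} v?\pi$ entails $u?\sigma \le_L g \wedge g \le_L v?\sigma$ which in turn validates $u?\sigma \le_L v?\sigma$. Suppose
$k > 0$ and that $u?\pi \le^{?} v \wedge u \le^{?} v?\pi$ is valid. By definition, $u?\pi \le^{?} v \leftrightarrow u.\pi \le v \vee
\bigvee_{\varrho < \pi} u.\varrho \le \bot$ and $u \le^{?} v?\pi \leftrightarrow u \le v.\pi \vee \bigvee_{\varrho < \pi} \top \le v.\varrho$ hold. If there exists $\varrho <
\pi$ such that $u.\varrho \le \bot$ or $\top \le v.\varrho$ then $u.\varrho \le v.\varrho$ is entailed for some prefix $\varrho$ of $\pi$. In this
case, $u?\pi^k\sigma \le_L v?\pi^k\sigma$ follows from Proposition 6. Otherwise, $u.\pi \le v \wedge u \le v.\pi$ is
valid. Let $u'$ and $v'$ be such that $u' = u.\pi$ and $v' = v.\pi$. Thus, $u' \le v \wedge v.\pi = v'$ holds
and entails $u'?\pi \le^{?} v'$ by Lemma 8. Symmetrically, $u' \le^{?} v'?\pi$ is entailed, too. The
induction hypothesis yields $u'?\pi^{k-1}\sigma \le_L v'?\pi^{k-1}\sigma$ and thus $u?\pi^k\sigma \le_L v?\pi^k\sigma$.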
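Signature        $\Sigma_n = \{\bot, \top, g\}$, $ar(g) = n$
Alphabet         $A_n = \{1, \ldots, n\}$
States           $Q_\psi = \{(u, v) \mid u, v \in V(\psi)\}$
Initial States   $I_\psi = \{(x, y)\}$
Increase         $(u, v) \xrightarrow{\varepsilon} (u', v) \in \Delta_\psi$   if $u \le u'$ in $\psi$
Decrease         $(u, v) \xrightarrow{\varepsilon} (u, v') \in \Delta_\psi$   if $v' \le v$ in $\psi$
Decomposition    $(u, v) \xrightarrow{i} (u_i, v_i) \in \Delta_\psi$ and $(u, v) \in F_\psi$
                 if $u = g(u_1, \ldots, u_n)$, $v = g(v_1, \ldots, v_n)$ in $\psi$ and $i \in A_n$
Equality         $(u, u) \xrightarrow{i} (u, u) \in \Delta_\psi$ and $(u, u) \in F_\psi$   if $i \in A_n$
P-Edges          $((u, v), (v, u)) \in P_\psi$   if $u, v \in V(\psi)$

Table 3. The finite automaton $\mathcal{A}_\psi = (A_n, Q_\psi, I_\psi, F_\psi, \Delta_\psi)$ and P-automaton
$\mathcal{P}_\psi = (\mathcal{A}_\psi, P_\psi)$ for $\psi \models x \le y$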



5 Entailment and P-Automata 

We continue with the signature $\Sigma_n = \{\bot, \top, g\}$ where $ar(g) = n$. We fix two variables
x, y globally and consider a constraint $\psi$ with $x, y \in V(\psi)$. In Table 3, we define a
finite automaton $\mathcal{A}_\psi$ and a P-automaton $\mathcal{P}_\psi$ for the judgment $\psi \models x \le y$.

Note that $\mathcal{A}_\psi$ and thus $\mathcal{P}_\psi$ depend on our global variables x and y.

The idea is that the P-automaton $\mathcal{P}_\psi$ recognizes all safe paths, i.e. those paths that
are not contradiction paths for $\psi \models x \le y$. In fact, the definition of $\mathcal{P}_\psi$ does not always
achieve this goal. This is not a problem for the purpose of this paper since our theory
will be based exclusively on the regular approximation of $\mathcal{P}_\psi$ provided by the finite
automaton $\mathcal{A}_\psi$. Even though the construction rules given in Table 3 apply without further
restriction to an automaton $\mathcal{P}_\psi$, $\mathcal{P}_\psi$ may well be useless if $\psi$ is not closed and clash-free,
or contains $\bot$ and $\top$.

Given a constraint $\psi$ over the signature $\Sigma_n$, the automata $\mathcal{P}_\psi$ and $\mathcal{A}_\psi$ constructed
in Table 3 recognize words over the alphabet $A_n = \{1, \ldots, n\}$. Their states are pairs of
variables $(u, v)$ in $V(\psi)$. The initial state is $(x, y)$, i.e. the pair of variables for which entailment
is tested. Ordering constraints $u \le v$ correspond to $\varepsilon$-transitions in the rules Increase
and Decrease. The Decomposition rule permits transitions that read a natural number
$i \in A_n$ and descend to the i-th child, in parallel for both variables in a state. States to
which decomposition applies are final. The Equality rule requires that states $(u, u)$ are
final and permitted to loop into themselves. The automaton $\mathcal{P}_\psi$ features P-edges for switching
the two variables in a state.

Proposition 10 (Soundness). Given a constraint $\psi$ with $x, y \in V(\psi)$ and signature
$\Sigma_n$ where $n \ge 1$, no word recognized by the P-automaton $\mathcal{P}_\psi$ is a contradiction path
for $\psi \models x \le y$.

Proof. We first show that $\pi \in \mathcal{L}(\mathcal{A}_\psi)$ implies entailment $\psi \models x?\pi \le_L y?\pi$ to hold.
Clearly, if $\mathcal{A}_\psi \vdash (u, v) \xrightarrow{\pi} (u', v')$ then $\psi$ entails $u?\pi \le^{?} u'$ and $v' \le^{?} v?\pi$. If $\pi \in
\mathcal{L}(\mathcal{A}_\psi)$ due to a transition $\mathcal{A}_\psi \vdash (x, y) \xrightarrow{\pi} (u, v)$ which ends in a final state
$(u, v)$ created by the Decomposition rule, then $\pi i \in \mathcal{L}(\mathcal{A}_\psi)$ for all $i \in A_n$. Thus, $\psi$
entails $x?\pi i \le^{?} u$ and $v \le^{?} y?\pi i$ for some variables u, v which in turn entails $x?\pi \le_L
y?\pi$ (Lemma 7). If $\pi \in \mathcal{L}(\mathcal{A}_\psi)$ because a transition $\mathcal{A}_\psi \vdash (x, y) \xrightarrow{\pi} (u, u)$ ending
in a final state $(u, u)$ was contributed by the Equality rule then $\psi$ entails $x?\pi \le^{?} y?\pi$ and thus
$x?\pi \le_L y?\pi$.

It remains to verify that P-edges cannot contribute a contradiction path. If a
path is contributed by a P-edge to $\mathcal{L}(\mathcal{P}_\psi)$ then it has the form $\pi(\sigma\varrho)^k\sigma$ such that
$\mathcal{A}_\psi \vdash (x, y) \xrightarrow{\pi} (u, v) \xrightarrow{\sigma\varrho} (v, u)$ for some $u, v \in V(\psi)$ (see Definition 3 and
the P-Edges rule in Table 3). From $\mathcal{A}_\psi \vdash (u, v) \xrightarrow{\sigma\varrho} (v, u)$ it follows that $\psi$
entails $u?\sigma\varrho \le^{?} v \wedge u \le^{?} v?\sigma\varrho$. Thus, Lemma 9 on strange loops implies that $\psi$
entails $u?(\sigma\varrho)^k\sigma \le_L v?(\sigma\varrho)^k\sigma$. Since $\mathcal{A}_\psi \vdash (x, y) \xrightarrow{\pi} (u, v)$ it follows that $\psi$ entails
$x?\pi(\sigma\varrho)^k\sigma \le_L y?\pi(\sigma\varrho)^k\sigma$, too.

[Figure: automata for the judgment $x \le y \times y \wedge x \times x \le y \models x \le y$. Entailment can be
contradicted at paths 12, 21, ...]

Fig. 2. The finite automaton and P-automaton for Example 11 and their languages

Example 11. For the signature $\Sigma_2$ the judgment $\varphi_2 : x \le y \times y \wedge x \times x \le y \models x \le y$ does
not hold if x and y are distinct variables. Entailment is contradicted by the solution of
$\varphi_2$ which maps x to $\bot \times (\top \times \bot)$ and y to $\top \times (\bot \times \top)$. The P-automaton $\mathcal{P}_{\varphi_2}$ illustrated
in Figure 2 explains what happens. The finite automaton $\mathcal{A}_{\varphi_2}$ recognizes the language
$\{\varepsilon\}$ only but $\mathcal{P}_{\varphi_2}$ has an additional P-edge from $(y, x)$ to $(x, y)$ by which it can also
recognize the words in $1^+ \cup 2^+$. Since P-edges are not normal $\varepsilon$-edges, the P-automaton
does not recognize the words 12 nor 21 which are in fact contradiction paths.

In Figures 2 and 3, we depict the language recognized by a P-automaton over
the alphabet $\{1, \ldots, n\}$ as an n-ary tree: a word recognized by the underlying finite
automaton corresponds to a node labeled by x, a word recognized by the additional
P-edges only is indicated by a node labeled with s (for strange loop). All other words
correspond to a node labeled with c (for contradiction).

Example 12. For the signature $\Sigma_1 = \{\bot, \top, g\}$ with $ar(g) = 1$ the entailment
judgment $\varphi_1 : x \le g(y) \wedge g(x) \le y \models x \le y$ holds. This might seem surprising since the
only difference to Example 11 seems to be the choice of a unary versus a binary
function symbol. The situation is again clarified when considering the P-automaton. The
automaton $\mathcal{P}_{\varphi_1}$ is given in Figure 3. In contrast to $\mathcal{P}_{\varphi_2}$ in Figure 2, the alphabet of $\mathcal{P}_{\varphi_1}$
is the singleton {1}. Thus, its language $\mathcal{L}(\mathcal{P}_{\varphi_1}) = \{1\}^*$ is universal. Hence, there
cannot be any contradiction path for $\varphi_1 \models x \le y$, i.e. entailment holds.
[Figure: automata for the judgment $x \le g(y) \wedge g(x) \le y \models x \le y$. Entailment holds.]

Fig. 3. The finite automaton and P-automaton for Example 12 and their languages



Examples 11 and 12 illustrate that P-edges have less effect on entailment in absence
of unary function symbols. In fact, we show in this paper that P-edges do not have any
effect on entailment for the restricted language. Even more importantly, this property
depends on the restriction that constraints $u = \bot$ or $u = \top$ are not supported.

The context-freeness failure for languages of P-automata has a counterpart
for non-structural subtype entailment, even for the restricted language.
This is illustrated by the judgment: $\varphi_3 : x \le u \wedge v \le y \wedge u = u \times y \wedge
v = v \times x \models x \le y$. The language $\mathcal{L}(\mathcal{P}_{\varphi_3})$ is not context-free since $\mathcal{P}_{\varphi_3}$ is
exactly the P-automaton considered in the proof of Lemma 5. On the other
hand, the non-context-free part of $\mathcal{L}(\mathcal{P}_{\varphi_3})$ does not force entailment
to hold.

6 Deciding Entailment in PSPACE 

We now show how to decide entailment for the restricted entailment problem with
signature $\Sigma_2$. Our algorithm requires polynomial space and applies to both structures
$NS_{\Sigma_2}$ or $NS^{fin}_{\Sigma_2}$ respectively. The only difference is hidden in the satisfiability test used.
Let NS be either of the two structures.

Proposition 13 (Characterization). Let $\varphi$ be a closed (restricted) constraint with
$x, y \in V(\varphi)$ which is clash-free with respect to NS. Then the entailment judgment
$\varphi \models x \le y$ holds in NS if and only if the set $\mathcal{L}(\mathcal{A}_\varphi)$ is universal, i.e. $\mathcal{L}(\mathcal{A}_\varphi) = \{1, 2\}^*$.

Proof. If $\mathcal{L}(\mathcal{A}_\varphi) = \{1, 2\}^*$ then no contradiction path for $\varphi \models x \le y$ exists (Proposition
10) and hence $\varphi \models x \le y$ holds (Proposition 6). Proving the converse (completeness) is
much more involved. This proof is sketched in Section 8.

Theorem 14 (Decidability and Complexity). Non-structural subtype entailment in
the restricted language is PSPACE-complete for both structures $NS^{(fin)}_{\Sigma_2}$.

Proof. Proposition 1 claims that entailment is PSPACE-hard. For deciding $\varphi \models x \le y$,
we compute the closure of $\varphi$ in polynomial time and check whether it is clash-free
with respect to $NS_{\Sigma_2}$ or $NS^{fin}_{\Sigma_2}$ respectively (Proposition 2). For closed and clash-free
$\varphi$, entailment holds if and only if $\mathcal{L}(\mathcal{A}_\varphi)$ is universal (Proposition 13). This can be
checked in PSPACE since $\mathcal{A}_\varphi$ is a finite automaton which can be constructed from $\varphi$ in
(deterministic) polynomial time.
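The decision procedure from this proof can be prototyped as follows. This is a sketch under stated assumptions, not the authors' implementation: constraints are interpreted over possibly infinite trees (so the occurs check S4 is skipped, and S3 cannot fire because $\times$ is the only function symbol in equations), x and y are assumed to occur in the constraint, and universality is tested by an explicit subset construction, which may take exponential space rather than the polynomial space the theorem guarantees. The encoding and the names `closure` and `entails` are illustrative.

```python
from itertools import product


def closure(eqs, les):
    """S0-S2 for restricted constraints; eqs holds (u, u1, u2) for u = u1 x u2."""
    variables = {v for eq in eqs for v in eq} | {v for le in les for v in le}
    les = set(les) | {(v, v) for v in variables}
    while True:
        new = {(a, c) for (a, b), (b2, c) in product(les, les) if b == b2}
        for (u, u1, u2), (v, v1, v2) in product(eqs, eqs):
            if (u, v) in les:
                new |= {(u1, v1), (u2, v2)}
        if new <= les:
            return les
        les |= new


def entails(eqs, les, x, y):
    """Decide whether the constraint entails x <= y via universality (Table 3)."""
    les = closure(eqs, les)
    children = {}
    for (u, u1, u2) in eqs:
        children.setdefault(u, []).append((u1, u2))

    def eps_closure(states):
        todo, seen = list(states), set(states)
        while todo:
            u, v = todo.pop()
            succ = {(u2, v) for (u1, u2) in les if u1 == u}      # Increase
            succ |= {(u, v1) for (v1, v2) in les if v2 == v}     # Decrease
            for s in succ - seen:
                seen.add(s)
                todo.append(s)
        return frozenset(seen)

    def is_final(state):
        u, v = state
        return u == v or (u in children and v in children)

    def step(states, i):
        out = set()
        for (u, v) in states:
            if u == v:
                out.add((u, v))                                  # Equality
            for cu, cv in product(children.get(u, []), children.get(v, [])):
                out.add((cu[i], cv[i]))                          # Decomposition
        return eps_closure(out)

    start = eps_closure({(x, y)})
    todo, seen = [start], {start}
    while todo:                      # subset construction, explored on the fly
        subset = todo.pop()
        if not any(is_final(s) for s in subset):
            return False             # some word is rejected: not universal
        for i in (0, 1):             # letters 1 and 2
            nxt = step(subset, i)
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return True


# Example 11, flattened: a = y x y, x <= a, b = x x x, b <= y
ex11 = ({("a", "y", "y"), ("b", "x", "x")}, {("x", "a"), ("b", "y")})
print(entails(*ex11, "x", "y"))
# x = a x b, y = c x d, a <= c, b <= d
comp = ({("x", "a", "b"), ("y", "c", "d")}, {("a", "c"), ("b", "d")})
print(entails(*comp, "x", "y"), entails(*comp, "y", "x"))
```

For Example 11 the automaton accepts only the empty word, so universality fails and entailment is rejected, in line with the counterexample given there.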
7 Completeness Proof

We prove the completeness of the characterization of entailment in Proposition 13. For
a constraint $\varphi$ of the restricted language, the idea is that we can freely extend the
P-automaton $\mathcal{P}_\varphi$ with additional P-edges without affecting universality. This
motivates the consideration of a language $Trace_\varphi$ which is recognized by the P-automaton
$(\mathcal{A}_\varphi, Q_\varphi \times Q_\varphi)$ where $Q_\varphi$ is the set of all states of $\mathcal{A}_\varphi$.

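Definition 15. We define the set $Base_\varphi$ of bases and $Trace_\varphi$ of traces of $\varphi \models x \le y$ by:

$Base_\varphi = \{\pi \mid \exists u, v : \mathcal{A}_\varphi \vdash (x, y) \xrightarrow{\pi} (u, v)\}$

$Trace_\varphi = \bigcup \{pr(o\pi^*) \mid o\pi \in Base_\varphi\}$

Lemma 16. The set $pr_{<}(Base_\varphi)$ is equal to the set $\mathcal{L}(\mathcal{A}_\varphi)$.

Proof. Showing that $\varepsilon \in \mathcal{L}(\mathcal{A}_\varphi)$ implies $\varepsilon \in pr_{<}(Base_\varphi)$ is left to the reader. If $\pi i \in
\mathcal{L}(\mathcal{A}_\varphi)$ then $\mathcal{A}_\varphi \vdash (x, y) \xrightarrow{\pi i} (u, v)$ for some (final) state $(u, v)$. Thus, $\pi i \in Base_\varphi$, i.e.
$\pi \in pr_{<}(Base_\varphi)$. For the converse, assume $\pi \in pr_{<}(Base_\varphi)$. Hence, $\pi i \in Base_\varphi$ for
some i. There exist transitions $\mathcal{A}_\varphi \vdash (x, y) \xrightarrow{\pi} (u, v) \xrightarrow{i} (u', v')$ with a final step
done by the Decomposition rule in Table 3. Thus, $(u, v) \in F_\varphi$, i.e. $\pi \in \mathcal{L}(\mathcal{A}_\varphi)$.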

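Lemma 16 implies that $\mathcal{L}(\mathcal{P}_\varphi) \subseteq Trace_\varphi$. The next proposition states that if $\mathcal{L}(\mathcal{A}_\varphi)$
is not universal then neither $Trace_\varphi$ nor $\mathcal{L}(\mathcal{P}_\varphi)$ are universal.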

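Proposition 17 (Escape). If $\sigma \notin \mathcal{L}(\mathcal{A}_\varphi)$ then there is a path $\varrho \notin Trace_\varphi$ with $\sigma < \varrho$.

Proof. We assume $\sigma \notin \mathcal{L}(\mathcal{A}_\varphi)$ and define $\varrho := \sigma 1^{|\sigma|} 2$ where $|\sigma|$ denotes the length
of $\sigma$ and $1^n$ a word which consists of exactly n letters 1. We prove $\varrho \notin Trace_\varphi$ by
contradiction. Suppose that $\varrho \in Trace_\varphi$. By definition there exist paths $o, \pi$ such that
$o\pi \in Base_\varphi$ and $\varrho \in pr(o\pi^*)$. Hence $\sigma \in pr(o\pi^*)$ such that either $\sigma < o\pi$ or $o\pi \le \sigma$.
It is not possible that $\sigma < o\pi$ since otherwise, $\sigma \in pr_{<}(o\pi) \subseteq pr_{<}(Base_\varphi)$ which by
Lemma 16 contradicts $\sigma \notin \mathcal{L}(\mathcal{A}_\varphi)$. Hence $o\pi \le \sigma$ such that $\sigma = o\pi\sigma'$ for some path $\sigma'$.
In combination with $\varrho = \sigma 1^{|\sigma|} 2 \in pr(o\pi^*)$ this yields $\sigma' 1^{|\sigma|} 2 \in pr(\pi^*)$. Furthermore,
$|\pi| \le |o\pi| \le |\sigma|$. The key point comes now: $\sigma' 1^{|\sigma|} 2 \in pr(\pi^*)$ and $|\pi| \le |\sigma|$ imply
$\pi \in 1^*$ which is impossible since $\pi$ must contain the letter 2. Hence, $\varrho \notin Trace_\varphi$.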



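Lemma 18 (Contradiction). Let $\varphi$ be closed and clash-free, $\sigma \notin \mathcal{L}(\mathcal{A}_\varphi)$, and $\varrho \notin
Trace_\varphi$: if $\sigma < \varrho$ then $\varrho$ is a contradiction path for $\varphi \models x \le y$ in NS.

Proof of Proposition 13 continued (Completeness). If $\mathcal{L}(\mathcal{A}_\varphi)$ is not universal then
there exists a path $\sigma < \varrho$ such that $\sigma \notin \mathcal{L}(\mathcal{A}_\varphi)$ and $\varrho \notin Trace_\varphi$ according to the
Escape Proposition 17. By Lemma 18, there exists a contradiction path which proves that
entailment $\varphi \models x \le y$ cannot hold.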
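$\varphi \vdash x?\varepsilon \le^{?} y$       if $x \le y$ in $\varphi$
$\varphi \vdash x?\pi i \le^{?} y$      if $\varphi \vdash x?\pi \le^{?} z$ and $z = f(z_1, \ldots, z_i, \ldots, z_n)$, $z_i \le y$ in $\varphi$
$\varphi \vdash x \le^{?} y?\varepsilon$       if $x \le y$ in $\varphi$
$\varphi \vdash x \le^{?} y?\pi i$      if $\varphi \vdash z \le^{?} y?\pi$ and $z = f(z_1, \ldots, z_i, \ldots, z_n)$, $x \le z_i$ in $\varphi$
$\varphi \vdash x?\pi \le^{?} y?\pi'$   if $\exists z : \varphi \vdash x?\pi \le^{?} z$ and $\varphi \vdash z \le^{?} y?\pi'$

Table 4. Syntactic Support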



8 Proof of the Contradiction Lemma 

In a first step, we refine the contradiction Lemma 18 into Lemma 21. This requires a
notion of syntactic support that is given in Table 4. If $\mu$ is a path constraint then the
judgment $\varphi \vdash \mu$ reads as '$\varphi$ supports $\mu$ syntactically'. Syntactic support for $\varphi$ refines
judgments performed by the finite automaton $\mathcal{A}_\varphi$. For instance, it holds for a closed and
clash-free constraint $\varphi$ that $\varphi \vdash x?\pi \le^{?} y?\pi$ iff $\mathcal{A}_\varphi \vdash (x, y) \xrightarrow{\pi} (u, u)$. Judgments like
$\varphi \vdash x?\pi \le^{?} y$ or $\varphi \vdash x?\pi \le^{?} y?\pi'$ cannot be expressed by $\mathcal{A}_\varphi$.

Lemma 19. For all path constraints $\mu$: if $\varphi \vdash \mu$ then $\varphi \models \mu$ holds.



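Definition 20. We define two functions $l_\varphi$ and $r_\varphi$ for the judgment $\varphi \models x \le y$.

$l_\varphi(\sigma) = \max\{\pi \mid \pi \le \sigma \wedge \exists u. \varphi \vdash x?\pi 1 \le^{?} u\}$   (left)

$r_\varphi(\sigma) = \max\{\pi \mid \pi \le \sigma \wedge \exists v. \varphi \vdash v \le^{?} y?\pi 1\}$   (right)

Note that if $l_\varphi(\sigma) \le r_\varphi(\sigma)$ then $l_\varphi(\sigma)$ is the maximal prefix of $\sigma$ in $\mathcal{L}(\mathcal{A}_\varphi)$.
Symmetrically, if $r_\varphi(\sigma) \le l_\varphi(\sigma)$ then $r_\varphi(\sigma)$ is the maximal prefix of $\sigma$ in $\mathcal{L}(\mathcal{A}_\varphi)$.

Lemma 21 (Contradiction refined). Let $\varphi$ be a closed and clash-free constraint and
$\sigma < \varrho$ paths such that $\sigma \notin \mathcal{L}(\mathcal{A}_\varphi)$ and $\varrho \notin Trace_\varphi$.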

1. iflcp{g) < Tcp{g) then p A x.q=T A y.g=L'X is satisfiable. 

2. ifl^{g) > x^{g) then p A x.g=x A y.g=LL is satisfiable. 

Trivially, Lemma 21 subsumes the contradiction Lemma 18. The proof of Lemma
21 captures the rest of this section. Since both of its cases are symmetric we restrict
ourselves to the first one. We assume that $\varphi$ is closed and clash-free and satisfies $x, y \in V(\varphi)$.
Given a fresh variable u we define a constraint s{PjQ) that is satisfaction equivalent to 
p A x.g=u A y^g=L x and in addition closed and clash- free. 

Definition 22. We call a set $D \subseteq \{1,2\}^*$ domain closed if $D$ is prefix-closed and
satisfies the following property for all $\pi \in \{1,2\}^*$: $\pi 1 \in D$ iff $\pi 2 \in D$. The domain
closure $dc(D)$ is the least domain closed set containing $D$.
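Definition 22 can be made concrete for finite sets of paths. The following is a minimal sketch, assuming paths are represented as Python strings over the alphabet "12"; the function name `domain_closure` is ours and is not taken from the paper.

```python
def domain_closure(paths):
    """Least domain closed set containing `paths` (finite sets only).

    A set D is domain closed if it is prefix-closed and, for every path p,
    p+"1" is in D exactly when p+"2" is in D.
    """
    closed = set()
    work = list(paths)
    while work:
        p = work.pop()
        if p in closed:
            continue
        closed.add(p)
        if p:
            parent = p[:-1]
            # prefix-closedness: the parent path must be present
            work.append(parent)
            # sibling condition: p = parent+"1" requires parent+"2" and vice versa
            sibling = parent + ("2" if p[-1] == "1" else "1")
            work.append(sibling)
    return closed
```

For example, closing the single path "12" adds its prefixes "" and "1" together with the siblings "2" and "11".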
Definition 23 (Saturation). Let $\varphi$ be a constraint, $x, y \in V(\varphi)$ and $\varrho \in \{1,2\}^*$.

For every z G {x,y} and tv G dc({^l, ^2}) let be a fresh variable and W{(f, g) the 

collection of these fresh variables. The saturation s{(f,g)of(fat path g is the constraint 
of minimal size satisfying properties a-f: 

a. p in s{p, g) 

b. for all ql € W{'~p, g) : ql=qh X q *2 i" ^V-P^ Q)ifT^ <Q 
c- q^=qeiXq^2 ins(93,^) 

d. for all q^ € W(if, g),u €V{if) : q^<u in s{(p, g) if (p \~ zlirff^u 

e. for all q^ € g)^u e V{p) : u<q^ in s{p, g)ifpi~ u<f^' zliv 

f forallql^^fP € WVp,g) : ql^ffP in s{p , g) ifpV- zlo<f''z'lo' 

Lemma 24. If $\varphi$ is closed and clash-free then $s(\varphi, \varrho)$ is also closed and clash-free.

Lemma 24 would fail for unrestricted constraints containing $\bot$ or $\top$. Its proof
is not difficult but tedious since it requires many case distinctions. We omit it for lack
of space. Instead we note the following lemma which, despite its simplicity, will turn
out to be essential.

Lemma 25. Let $\sigma$ and $o$ be paths with $\sigma \le o\sigma$. If $o \ne \varepsilon$ then $\sigma \in pr(o^*)$.
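Although the proof of Lemma 25 is omitted, the statement is easy to test exhaustively on short paths. The sketch below assumes paths are strings over "12", reads $\le$ as the prefix ordering and $pr(o^*)$ as the set of prefixes of powers of $o$; the helper names are ours.

```python
from itertools import product


def in_prefixes_of_star(sigma, o):
    """True iff sigma is a prefix of o repeated some number of times."""
    if o == "":
        return sigma == ""
    repetitions = len(sigma) // len(o) + 1  # enough copies to cover sigma
    return (o * repetitions).startswith(sigma)


def check_lemma_25(max_len=4, alphabet="12"):
    """Brute-force check: sigma <= o.sigma and o nonempty imply sigma in pr(o*)."""
    words = ["".join(w) for n in range(max_len + 1)
             for w in product(alphabet, repeat=n)]
    return all(in_prefixes_of_star(sigma, o)
               for sigma in words for o in words
               if o and (o + sigma).startswith(sigma))
```

Such a check is no substitute for the proof; it only confirms the reading of the statement on all paths up to the chosen length.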

The proof of Lemma 25 is simple and thus omitted. We can now approach the final 
step in which we show that s(p, g) A q®=T is also closed and clash-free. Closedness 
follows trivially from Lemma 24 but clash-freeness requires work. The only clash rule 
which might possibly apply is S3. Since ± does not occur in s{p,g), S3 can only be 
applied with q®=T andT^^x,i.e. if there are u?, u?i, u ?2 G V(s(p,^)) such that: 

w=wiXW 2 in s{p,g) and q^<w in s(p, g) 

We have to distinguish all possible choices of u? G V(s(p, ^)) but restrict ourself to the 
more interesting cases where w G W(p, g). In this case, q^<yj was added to s(p, g) 
by rule f in Definition 23. Since w=wiXW 2 in s{p^ g) and w G W(p, g) it follows that 
w = q^, or w = q^ for some tv < g and x G {x, y}. 

1. Case w = qy-. Rule f requires p h xla<f^yla for some prefix a<g. This is equiv- 

alent to h {xpy) (w, v) for some u. The Equality rule in the automaton 
construction yields h {x,y) (w,w). Thus, g G C{Aip) which contradicts 

g ^ Trace^. 

2. Case w = where tv < g and z G {x,y}: Rule f requires the existence of 
a, g' ^ tv' such that p h xF. g' zItv' where g = g'o and tv = tv' o. From tv < g it 
follows that tv' a < g'a and thus tv' < g'. Let o f she such that tv'o = g'. Thus, 
tv' a < tv' OG which in turn yields a < oa. The key point comes now. We can apply 
Lemma 25 in order to deduce u G pr(o*). Hence g = g'u = tv' oa G 7r'opr(o*) C 
pr(7r'o*). Since p h xF. f <f^ zItv' there exists u such that p h x?g'<f^u; together 

with our assumption i^(^) < r^{g) it follows that h {x,y) {upf for some 
w , V. Hence, g' G Base^, i.e. tv'o g Base^. Combined with g G prfy'of, we obtain 
g G TracCep in contradiction to our assumption. 
Fig. 4. An example for the general case. Entailment holds.



9 Conclusion and Future Work 

We have solved the problem of non-structural subtype entailment over the signature
$\{\bot, \top, \times\}$ for the restricted language where $\bot$ and $\top$ are not supported syntactically.
We have proved PSPACE-completeness both for simple and recursive types. We have
presented the notion of a P-automaton and illustrated its importance for understanding
non-structural subtype entailment. Because of its P-edges a P-automaton can recognize
non-context-free languages. As far as non-structural subtype entailment for the
restricted language is concerned, we have proved that non-regularity can be safely ignored.

We believe that our methods can be extended to the full problem of non-structural
subtype entailment. However, the full problem may well turn out to be harder
than PSPACE. More research is needed to settle this question. The
main problem in the general case is that we have to take P-edges into account. This is
illustrated by the following example:



Entailment holds even though the language of the finite automaton given in Figure 4
is not universal. The construction rules for this automaton are more involved than in
Table 3 since $\top$ has to be accounted for. A P-edge from $(x, y)$ to $(y, z)$ has to be added
even though only one of the two variables is switched.
Abstract. We show that the cut-elimination for LKT, as presented in
Danos et al. (1993), simulates the normalization for classical natural
deduction (CND). In particular, the denotation for CND inherits the one for
LKT. Moreover, the transform from a CND proof (i.e., Parigot’s $\lambda\mu$-term)
to an LKT proof can be considered as a classical extension of the call-by-name
(CBN) CPS-transform.



1 Introduction 

What is LKT?: One constructive classical logic we consider is LKT, presented
by Danos, Joinet and Schellinx (DJS) [1]. It has long been thought that classical
logic cannot be put to use for computational purposes. This is because, in
general, the normalization process for proofs in classical logic has many
critical pairs. In the case of Gentzen’s sequent-style classical logic LK,
normalization is cut-elimination. It is Gentzen’s theorem that LK
has a Strongly Normalizing (SN) cut-elimination procedure. However, it is not
Church-Rosser (CR). LKT is a variant of LK which is equipped with an SN and CR
cut-elimination procedure. We say LKT is constructive in this sense. The SN and
CR cut-elimination procedure is called the tq-protocol. The CR property of the
tq-protocol is recovered by adding some restrictions on the logical rules of LK. Despite
the restrictions, soundness and completeness w.r.t. classical provability are still
retained in LKT. LKT is “classical logic” in this sense.

What is Classical Natural Deduction?: The other constructive classical
logic is classical natural deduction (CND), presented by Parigot [9]. Church’s
$\lambda$-calculus is widely accepted as the logical basis of functional programming. It
is also well known that typed $\lambda$-calculus has a Curry-Howard correspondence with
intuitionistic natural deduction. Parigot extends this idea to a classical logic:
CND. Its computational interpretation is a natural extension of the call-by-name
(CBN) $\lambda$-calculus, called the $\lambda\mu$-calculus. The $\lambda\mu$-calculus, equipped with the so-called
“structural reduction” rule, is known to be SN and CR [9]. This means exactly that
the normalization procedure (in the sense of Parigot) of CND is SN and CR.
Therefore CND can also be considered as a constructive classical logic. Hereafter
we refer to Parigot’s $\lambda\mu$-calculus as $\lambda\mu_n$ in order to stress the fact
that it is CBN.
Our Work: In this paper, we show how CND and LKT are related. More
precisely, the normalization for CND can be simulated by the tq-protocol for LKT
(simulation theorem). Therefore the SN and CR property of $\lambda\mu_n$ is shown to
be a consequence of that of the tq-protocol. Moreover, we show that the transform from
a CND proof to an LKT proof can be considered as the essence of the CPS-transform,
which is of interest in computer science.

This paper consists of two parts. In the first part, we present a new term calculus
for LKT. Hereafter we refer to this term calculus as the $\lambda\mu_t$-calculus. By this
method, LKT, which is a purely proof-theoretical artifact, is set in correspondence
with a simple term rewriting system. In the second part, we show how $\lambda\mu_n$
can be simulated by $\lambda\mu_t$. Our investigation tool is a transform: it inductively
transforms a $\lambda\mu_n$-term into a $\lambda\mu_t$-term. As we mentioned before, $\lambda\mu_n$ can
be considered as a classical extension of the CBN $\lambda$-calculus. Also, we have already
shown that LKT is the target language of the CBN CPS-transform [7]. In sum,
this transform can be considered as a classical extension of Plotkin’s CBN
CPS-transform [10]. Note that Plotkin’s result is on the untyped case. However, we
restrict ourselves to the well-typed case. This is because our investigation is purely
proof-theoretical.

As far as the denotational semantics is concerned, the advantage of our
method is clear. Our main result shows that $\lambda\mu_n$ can be simulated by the tq-protocol.
The tq-protocol can be simulated further by the $\lambda$-calculus [8]. Hence $\lambda\mu_n$ is shown to
have a denotation in cartesian closed categories. This observation contrasts with
categorical semantics approaches such as the “control category” of Selinger [11] and the
“continuation category” of Hofmann and Streicher [6]. Moreover, by DJS’s
simulation theorem for linear decoration, $\lambda\mu_n$ also inherits the denotation
in linear logic’s coherent space semantics. This fact seems to be overlooked in
the categorical semantics community.

Related Works: In the proof theory community, our simulation theorem
has been considered informally by some people. Herbelin [5] published a detailed work
on the term calculus of LJT, which is an intuitionistic fragment of LKT. He
interpreted LJT proofs as programs and cut-elimination steps as reduction steps,
just as we do here. DJS also mentioned the relation between CND and LKT
in [2] (p. 795). They refer there to a manuscript for full details; however, the
manuscript is not publicly available yet. Hence it should be considered as an
open problem. Moreover, what is new in our work is how it is proved. We use
the technique of the CPS-transform. That is, we discover the relation between
constructive classical logic and the CPS-transform. Needless to say, both are
important concepts in their respective communities.

In the community of the theory of programming languages, De Groote
describes the CPS-transform of $\lambda\mu_n$ in [3]. However, his method is merely
syntactic; he does not mention the relation to proof theory. That is, the
second part of our work can be seen as a proof-theoretical recast of his work.
In addition, our simulation theorem establishes a clearer relation. We describe
how De Groote’s work can be recovered within our framework at the end.
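Table 1. Original Derivation Rules for LKT

$$\frac{}{A;\ \Rightarrow A}\ \mathrm{Ax} \qquad \frac{A;\ \Gamma \Rightarrow \Delta}{;\ \Gamma, A \Rightarrow \Delta}\ \mathrm{D}$$

$$\frac{\Pi;\ \Gamma \Rightarrow \Delta}{\Pi;\ \Gamma, A \Rightarrow \Delta}\ \mathrm{LW} \qquad \frac{\Pi;\ \Gamma \Rightarrow \Delta}{\Pi;\ \Gamma \Rightarrow \Delta, A}\ \mathrm{RW}$$

$$\frac{\Pi;\ \Gamma, A, A \Rightarrow \Delta}{\Pi;\ \Gamma, A \Rightarrow \Delta}\ \mathrm{LC} \qquad \frac{\Pi;\ \Gamma \Rightarrow \Delta, A, A}{\Pi;\ \Gamma \Rightarrow \Delta, A}\ \mathrm{RC}$$

$$\frac{\Pi;\ \Gamma_0 \Rightarrow \Delta_0, A \qquad A;\ \Gamma_1 \Rightarrow \Delta_1}{\Pi;\ \Gamma_0, \Gamma_1 \Rightarrow \Delta_0, \Delta_1}\ \text{h-cut} \qquad \frac{;\ \Gamma_0 \Rightarrow \Delta_0, A \qquad \Pi;\ \Gamma_1, A \Rightarrow \Delta_1}{\Pi;\ \Gamma_0, \Gamma_1 \Rightarrow \Delta_0, \Delta_1}\ \text{m-cut}$$

$$\frac{B;\ \Gamma_0 \Rightarrow \Delta_0 \qquad ;\ \Gamma_1 \Rightarrow \Delta_1, A}{A \to B;\ \Gamma_0, \Gamma_1 \Rightarrow \Delta_0, \Delta_1}\ \mathrm{L}{\to} \qquad \frac{\Pi;\ \Gamma, A \Rightarrow \Delta, B}{\Pi;\ \Gamma \Rightarrow \Delta, A \to B}\ \mathrm{R}{\to}$$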

2 $\lambda\mu_t$ for LKT

In the following, we use the word derivation, instead of proof, for a tree of
derivation rules.

First of all, we quote the original derivation rules for LKT from [1] in Table 1.
$\Rightarrow$ is the entailment sign of the calculus. We use rhs and lhs for the right-hand
side and left-hand side of the entailment sign, respectively. The letters L/R
stand for Left and Right introduction, D for Dereliction, C for Contraction
and W for Weakening. Notice also that we use different names for the cut rules.
We use m-cut instead of “mid” and h-cut instead of “head”. This is needed to
avoid confusion. Notice that, in their original definition, contexts are interpreted
as multisets, and structural rules are explicit.

2.1 Indexed Logical Systems 

Now we explain our formulation of logical systems. We use the indexed formula
method, which was first developed by Zucker to relate Gentzen’s sequent-style
derivations and $\lambda$-terms [12]. Parigot follows his method to relate $\lambda\mu_n$-terms
and CND derivations.

In order to relate a term and a derivation, we need some way to specify formulas.
For this, we change the notion of context. We interpret a context as a
set of indexed formulas. Accordingly, in contrast to the original, structural
rules are implicit. Formulas are those of first order propositional logic
constructed from $\to$. We use the same implication symbol in all logical systems.
An indexed formula is an ordered pair of a formula and an index. We assume
there are denumerably many $\lambda$-indices (resp. $\mu$-indices) ranged over by $x, y, z, \ldots$
(resp. $\alpha, \beta, \gamma, \ldots$). We write an indexed formula $(A, x)$ as $A^x$ and $(A, \alpha)$ as $A^\alpha$.
Sequents of each logical system are of the following form:



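$$\begin{array}{ll}
\mathrm{LK} & \Gamma \Rightarrow \Delta \\
\mathrm{LKT} & \Pi;\ \Gamma \Rightarrow \Delta \\
\mathrm{CND} & \Gamma \Rightarrow A, \Delta
\end{array}$$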




A CPS-Transform of Constructive Classical Logic 269 



where $\Gamma$ is a $\lambda$-context, which is a set of $\lambda$-indexed formulas. Similarly, $\Delta$ is a
$\mu$-context, which is a set of $\mu$-indexed formulas. Comma means taking union as
sets. Thus, the set $\Gamma_0 \cup \Gamma_1$ is denoted by “$\Gamma_0, \Gamma_1$” and $\{A^x\} \cup \Gamma$ by “$A^x, \Gamma$”. $\Pi$
denotes at most one unindexed formula.

We only handle multiplicative rules in every logical system. That is, the $\lambda$-contexts
($\mu$-contexts) in the conclusion are the union of the $\lambda$-contexts ($\mu$-contexts,
respectively) in the premises. For example, in L$\to$ of LK:

$$\frac{\Gamma_0 \Rightarrow \Delta_0, A^\alpha \qquad B^y, \Gamma_1 \Rightarrow \Delta_1}{(A \to B)^z, \Gamma_0, \Gamma_1 \Rightarrow \Delta_0, \Delta_1}$$

Hereafter, for readability, we only write active and main formulas, and omit
contexts as follows:

$$\frac{\Rightarrow A^\alpha \qquad B^y \Rightarrow}{(A \to B)^z \Rightarrow}$$

In the above, $A$ and $B$ are active formulas, while $A \to B$ is the main formula. As
we interpret contexts as sets, occurrences of formulas with the same index are
automatically contracted. One can interpret this as binary rules always being
followed by appropriate explicit contractions which rename the indices to the
same name. We also interpret axiom rules as containing appropriate weakenings as
contexts. Therefore, we say, structural rules are implicit. Notice that in LKT,
application of structural rules is restricted to $\Gamma$ and $\Delta$. Specifically, $\Pi$ cannot
be introduced by weakening.

We use $\pi$ for derivations. An initial index is an index which appears for
the first time in the whole derivation. We assume all initial indices are distinct
unless they are truly related (i.e., subject to further implicit contraction). This
is possible by introducing “concatenation” of indices at every binary rule.
See Zucker [12].



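2.2 Raw $\lambda\mu_t$-Terms

We introduce a new notation for $\lambda\mu_t$-terms. In a word, it has a “bi-directional”
form of application. The two forms correspond to the two orientations of cut in Gentzen’s
sequent-style classical logic. We explain this in the next subsection. Accordingly
we have two kinds of $\beta$-contractions.

The raw $\lambda\mu_t$-terms, ranged over by $s, t, u$, etc., are defined as follows:

$$\begin{array}{rcll}
s, t, u & ::= & x & \lambda\text{-variable} \\
 & \mid & \lambda x^A.t & \lambda\text{-abstraction} \\
 & \mid & \mu\alpha^A.s & \mu\text{-abstraction} \\
 & \mid & (s)\,t & \text{t-application} \\
 & \mid & \langle s\rangle\,t & \text{q-application} \\
 & \mid & [\alpha]s & \text{named-term} \\
 & \mid & (x,\beta).t & \text{R-term} \\
 & \mid & h\langle s, u\rangle & \text{L-term}
\end{array}$$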
In the above definition, t-application corresponds to the usual application (i.e.,
application in the $\lambda$-calculus). q-application is the new form of application in the reverse
direction. We often omit the type of the $\lambda$-variable ($\mu$-name) in a $\lambda$-abstraction
($\mu$-abstraction) when it is clear from context.



2.3 The Notion of Orientation of Cut 

To begin with, let us briefly introduce the notion of orientation of cut in the
tq-protocol [2]. This can be explained by considering why LK has
critical pairs. The most typical example of a critical pair is as follows:

$$\frac{\dfrac{\Rightarrow}{\Rightarrow A}\ \mathrm{RW} \qquad \dfrac{\Rightarrow}{A \Rightarrow}\ \mathrm{LW}}{\Rightarrow}\ \mathrm{cut}$$

This simple example shows the source of non-confluence. Obviously, we have to
choose whether the subderivation of the left premise of the cut or that of the
right premise is going to be erased. A premise can be erased because the cut-formula
in the other premise is introduced by weakening. We get two non-confluent results according
to the choice of the direction. This choice is exactly the orientation of cut. In LKT, the
orientation of cut is fixed beforehand; the two cut rules (h-cut and m-cut) represent
the two orientations. Roughly speaking, this is how LKT recovers the CR property.
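The critical pair can be mimicked by a toy model. The sketch below is only an illustration under our own assumptions: a derivation is represented by a name, and the class `CutOfWeakenings` and the orientation labels are hypothetical, not notation from DJS or from this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CutOfWeakenings:
    """A cut whose cut-formula is introduced by RW on the left and LW on the right."""
    left_sub: str   # name of the subderivation above RW
    right_sub: str  # name of the subderivation above LW


def eliminate(cut, orientation):
    """Eliminate the cut by erasing one premise, chosen by the orientation."""
    if orientation == "keep_left":
        # the right premise is erased: its cut-formula came from weakening on the left
        return cut.left_sub
    if orientation == "keep_right":
        # the left premise is erased: its cut-formula came from weakening on the right
        return cut.right_sub
    raise ValueError("unknown orientation: " + orientation)
```

With two distinct subderivations the two orientations give different results, which is the non-confluence described above; fixing the orientation in advance removes the choice.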

Now, our intention is to express cut-elimination by /^-reduction. Therefore 
we introduce new notation to express two orientation of cut as follows: 



s: t: 




s : 



\t: 



\x^.t} : 



The sequent in the box should be duplicated/erased. Now we adapt this idea to the
tq-protocol on LKT. We split the $\beta$ redex rule into two as follows:

$$(\beta^t)\quad (\lambda x^A.t)\,(\mu\alpha^A.s) \to_\beta t\,[x := \mu\alpha^A.s] \qquad\qquad (\beta^q)\quad \langle\lambda h^A.t\rangle\,(\mu\alpha^A.s) \to_\beta s\,[\alpha := \lambda h^A.t]$$

where $\to_\beta$ denotes one-step $\beta$-reduction. In the rule above, using the jargon
of [2], the cut-formula is attractive and main in a logical rule. Hence the
attracting subderivation represented by $t$ is the one transported. When this
transportation passes an instance of contraction/weakening, it is
duplicated/erased. This process corresponds to the structural step in the tq-protocol. We
will come back to this point later.

The $\lambda\mu_t$-term of the form $(\lambda x^A.t)\,(\mu\alpha^A.s)$ is called a $\beta^t$-redex. Similarly,
$\langle\lambda h^A.t\rangle\,(\mu\alpha^A.s)$ is called a $\beta^q$-redex. $t$ is normal iff no subterm of $t$ is a
$\beta^t$-redex or a $\beta^q$-redex. The result of $\beta$-reduction on a $\beta$-redex is called its $\beta$-contractum.



2.4 Substitution 

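$t\,[x^A := s]$ denotes the standard substitution as a meta-operation. In addition to
this, we use $\mu$-name substitution. The result of the $\mu$-name substitution $u\,[\alpha^A := \lambda h.s]$
is obtained from $u$ by recursively replacing every named subterm of $u$ of
the shape $[\alpha]v$ by $s\,[h := v\,[\alpha := \lambda h.s]]$. Formally it is defined as follows:

$$([\alpha]v)\,[\alpha := \lambda h.s] = s\,[h := v\,[\alpha := \lambda h.s]]$$

$$([\beta]v)\,[\alpha := \lambda h.s] = [\beta](v\,[\alpha := \lambda h.s]), \quad \text{if } \alpha \ne \beta$$

We only show the base case. This definition says that named-term substitution
is made of two substitutions. First, the named term is interpreted as a t-application.
That is, the result of replacing $\alpha$ by $\lambda h.s$ is $(\lambda h.s)\,(v\,[\alpha := \lambda h.s])$. Then this newly
created redex is immediately reduced. Hence we get $s\,[h := v\,[\alpha := \lambda h.s]]$. Notice
that $h$ occurs exactly once in $s$, so the second substitution $[h := v\,[\alpha := \lambda h.s]]$
does not duplicate or erase $v\,[\alpha := \lambda h.s]$.

Table 2. $\lambda\mu_t$-Term Assignment for LKT

$$\frac{}{[\alpha]h :\ A^h;\ \Rightarrow A^\alpha}\ \mathrm{Ax} \qquad \frac{t :\ A^h;\ \Rightarrow}{\langle\lambda h.t\rangle x :\ ;\ A^x \Rightarrow}\ \mathrm{D}$$

$$\frac{s :\ ;\ \Rightarrow A^\alpha \qquad t :\ A^h;\ \Rightarrow}{\langle\lambda h.t\rangle\,(\mu\alpha.s) :\ ;\ \Rightarrow}\ \text{h-cut} \qquad \frac{s :\ ;\ \Rightarrow A^\alpha \qquad t :\ ;\ A^x \Rightarrow}{(\lambda x.t)\,(\mu\alpha.s) :\ ;\ \Rightarrow}\ \text{m-cut}$$

$$\frac{u :\ B^h;\ \Rightarrow \qquad s :\ ;\ \Rightarrow A^\alpha}{h'\langle\mu\alpha.s, \lambda h.u\rangle :\ (A \to B)^{h'};\ \Rightarrow}\ \mathrm{L}{\to} \qquad \frac{t :\ ;\ A^x \Rightarrow B^\beta}{[\gamma](x,\beta).t :\ ;\ \Rightarrow (A \to B)^\gamma}\ \mathrm{R}{\to}$$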
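The $\mu$-name substitution of Section 2.4 and the two $\beta$ rules of Section 2.3 can be prototyped on a small term representation. The following is a minimal sketch under several assumptions of ours: terms are nested tuples, R-terms and L-terms are left out, type annotations are dropped, and all bound names are distinct (so no capture-avoiding renaming is performed). The tags and function names are hypothetical.

```python
# Term shapes (tuples): ("var", x), ("lam", x, t), ("mu", a, s),
# ("app", s, t) for t-application, ("qapp", s, t) for q-application,
# ("named", a, s) for the named term [a]s.

def subst_var(term, x, repl):
    """Standard substitution term[x := repl]; assumes no variable capture."""
    tag = term[0]
    if tag == "var":
        return repl if term[1] == x else term
    if tag == "lam":
        _, y, body = term
        return term if y == x else ("lam", y, subst_var(body, x, repl))
    if tag in ("mu", "named"):
        return (tag, term[1], subst_var(term[2], x, repl))
    if tag in ("app", "qapp"):
        return (tag, subst_var(term[1], x, repl), subst_var(term[2], x, repl))
    raise ValueError("unknown term: %r" % (term,))


def subst_name(term, a, h, s):
    """Mu-name substitution term[a := lam h. s]."""
    tag = term[0]
    if tag == "var":
        return term
    if tag == "lam":
        return ("lam", term[1], subst_name(term[2], a, h, s))
    if tag == "mu":
        _, b, body = term
        return term if b == a else ("mu", b, subst_name(body, a, h, s))
    if tag == "named":
        _, b, body = term
        new_body = subst_name(body, a, h, s)
        # [a]v becomes s[h := v'], otherwise the name is kept
        return subst_var(s, h, new_body) if b == a else ("named", b, new_body)
    if tag in ("app", "qapp"):
        return (tag, subst_name(term[1], a, h, s), subst_name(term[2], a, h, s))
    raise ValueError("unknown term: %r" % (term,))


def contract(redex):
    """Contract (lam x.t)(mu a.s) or <lam h.t>(mu a.s) at the root."""
    tag, fun, arg = redex
    if fun[0] != "lam" or arg[0] != "mu":
        raise ValueError("not a beta-redex")
    _, bound, body = fun
    _, name, inner = arg
    if tag == "app":    # t-application: t[x := mu a.s]
        return subst_var(body, bound, arg)
    if tag == "qapp":   # q-application: s[a := lam h.t]
        return subst_name(inner, name, bound, body)
    raise ValueError("not an application")
```

For instance, contracting $\langle\lambda h.[\beta]h\rangle\,(\mu\alpha.[\alpha]x)$ in this model replaces the named subterm $[\alpha]x$ by $([\beta]h)[h := x]$, giving $[\beta]x$.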

2.5 Definitions for $\lambda\mu_t$-Calculus

Our $\lambda\mu_t$-term assignment for LKT is displayed in Table 2.

Definition 1 ($\lambda\mu_t$ as a reduction system). The reduction relation $\to_{\lambda\mu_t}$
of $\lambda\mu_t$, viewed as a rewrite system, is defined to be the compatible closure of the
notion of reduction defined by the two redex rules, namely $\beta^t$ and $\beta^q$.

The relation $\to_{\lambda\mu_t}$ defines the full LKT-reduction on $\lambda\mu_t$-terms. We write the reflexive
and transitive closure of $\to_{\lambda\mu_t}$ as $\Rightarrow_{\lambda\mu_t}$. Hereafter we refer to the reduction
system as the $\lambda\mu_t$-calculus.

A term assignment judgment is an ordered pair of a $\lambda\mu_t$-term and an indexed
sequent. We write a judgment $(s,\ \Pi;\ \Gamma \Rightarrow \Delta)$ as $s : \Pi;\ \Gamma \Rightarrow \Delta$. Derivation
rules define the term assignment judgments. A derivation is a tree of derivation
rules whose leaves are axioms and whose inner nodes are derivation rules other than
axioms. We use $\pi$ for derivations. We call the term $s$ the term assigned to
the derivation $\pi$, and refer to it as $\mathrm{TermOf}(\pi)$. Two derivations
are said to be equal if they differ only in the indices of the formulas. Notice
that this equivalence is identical with the weak equivalence of DJS [2] thanks
to the indexed formulas and implicit structural rules in our formulation. We say that
the derivation is cut-free if it contains neither m-cut nor h-cut. Let $\pi$ be a
derivation whose last inference rule is m-cut. We call $\mathrm{TermOf}(\pi)$, which
is a $\beta$-redex, an m-cut redex. An h-cut redex is defined in the same way.

2.6 Some Properties of $\lambda\mu_t$-Calculus

Our term assignment faithfully reflects the structure of the sequent calculus. Thus
the inductive definition of substitution on terms agrees with induction on the
length of the derivation. We can state this formally as follows:

Proposition 1 (subterm property). In every derivation rule, all terms of the
premises are included as subterms in the term of the conclusion.

Proof. By mechanical checking of the term assignment for each derivation rule.



Proposition 2 (cut-freeness and normality). The derivation $\pi$ is cut-free
iff $\mathrm{TermOf}(\pi)$ is normal.

Proof. (^) By induction on the length of derivation. E.g. for : By induction 
hypothesis s and t are normal, hence {Xy.t}{xs) is normal. (<^) Obvious, as 
term of of h-cut is of the form of /?^-redex, and term of m-cut is /?^-redex. 

Now we are ready to prove that our term assignment and bi-directional $\beta$-redexes
simulate the tq-protocol.

Theorem 1 (tq-protocol simulated by normalization). If an LKT derivation
$\pi$ reduces to $\pi'$ by one tq-protocol step, then $\mathrm{TermOf}(\pi) \to_{\lambda\mu_t} \mathrm{TermOf}(\pi')$ by
reducing the m-cut (h-cut) redex associated with the m-cut (h-cut).

Proof (sketch). Naturally, the $\lambda\mu_t$-terms are assigned so as to be compatible with the
tq-protocol. The rule Ax says that we identify the head-variable with the head-index.
The rule D says that we identify the set of $\lambda$-variables with the set of
$\lambda$-indices. It also says that we identify $\mu$-names and $\mu$-indices. Namely, in a
term assignment judgment, each $\lambda$-variable (resp. $\mu$-name) that occurs free in the
$\lambda\mu_t$-term is identified with the $\lambda$-index (resp. $\mu$-index) of the same name.

Structural Step (or S-step) of tq-protocol: The S-step consists of two phases,
namely the S1-step and the S2-step.

Elimination of h-cut corresponds to the S2-step. In the case of h-cut, the attractive
cut-formula is main in a logical rule (i.e., introduced by L$\to$). Hence we transport the
attracting subderivation $t$. Transporting $t$ corresponds exactly to the $\mu$-name
substitution $s\,[\alpha := \lambda h.t]$ (defined in Section 2.4). In the process of $\mu$-name
substitution, $\lambda h.t$ is duplicated (resp. erased) whenever it passes an instance of
(implicit) contraction (resp. weakening).
That is, elimination of h-cut corresponds to the S2-step in the tq-protocol.

Similarly, elimination of m-cut corresponds to the S1-step. That is, the S1-step is
simulated by $\lambda$-variable substitution. When the $\lambda$-variable substitution reaches
an instance of the D rule, it turns into an h-cut. That is, the S1-step changes into the
S2-step at the occurrence of the D rule.

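Logical Step (or L-step) of tq-protocol: The L-step decomposes the implication
($\to$). One may think of an R-term as a kind of abstraction, and an L-term as a
kind of application. Moreover, an L-step can be considered as a communication.
The L-term $h\langle s, u\rangle$ sends the two subderivations $s$ and $u$ to the R-term. The R-term $(x,\beta).t$
receives the two subderivations by $x$ and $\beta$. Under these intuitions, we display
how L-terms and R-terms work in simulating the L-step of the tq-protocol:

$$\frac{\dfrac{u :\ B^h;\ \Rightarrow \qquad s :\ ;\ \Rightarrow A^\alpha}{h'\langle\mu\alpha.s, \lambda h.u\rangle :\ (A \to B)^{h'};\ \Rightarrow}\ \mathrm{L}{\to} \qquad \dfrac{t :\ ;\ A^x \Rightarrow B^\beta}{[\gamma](x,\beta).t :\ ;\ \Rightarrow (A \to B)^\gamma}\ \mathrm{R}{\to}}{\langle\lambda h'.h'\langle\mu\alpha.s, \lambda h.u\rangle\rangle\,(\mu\gamma.[\gamma](x,\beta).t) :\ ;\ \Rightarrow}\ \text{h-cut}$$

reduces to:

$$\frac{\dfrac{s :\ ;\ \Rightarrow A^\alpha \qquad t :\ ;\ A^x \Rightarrow B^\beta}{(\lambda x.t)\,(\mu\alpha.s) :\ ;\ \Rightarrow B^\beta}\ \text{m-cut} \qquad u :\ B^h;\ \Rightarrow}{\langle\lambda h.u\rangle\,(\mu\beta.(\lambda x.t)\,(\mu\alpha.s)) :\ ;\ \Rightarrow}\ \text{h-cut}$$

Obviously, $\langle\lambda h'.h'\langle\mu\alpha.s, \lambda h.u\rangle\rangle\,(\mu\gamma.[\gamma](x,\beta).t)$ reduces to $((x,\beta).t)\,\langle\mu\alpha.s, \lambda h.u\rangle$
by $\mu$-name substitution. However, this is not equal to the series of applications
of m-cut and h-cut. Hence we need a syntactic trick to mimic this. We define the
syntactical equality as follows:

$$((x,\beta).t)\,\langle\mu\alpha.s, \lambda h.u\rangle = \langle\lambda h.u\rangle\,(\mu\beta.(\lambda x.t)\,(\mu\alpha.s))$$

$((x,\beta).t)\,\langle\mu\alpha.s, \lambda h.u\rangle$ can be regarded as representing the simultaneous application
of m-cut and h-cut. In fact, the two cuts are commutative; we can change the order
of m-cut and h-cut for free. This is the essence of the CR property of the tq-protocol.
Let us say that this syntactical equality is needed to fill the gap between the term calculus
and the sequent calculus.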
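The L-step can be traced term by term. The derivation below is a sketch in our own reading of the notation (angle brackets for q-application and L-terms), assuming that $\gamma$ does not occur in $t$ and that $h'$ is the head-variable of the L-term; the last line uses the syntactical equality rather than a reduction step.

```latex
\begin{align*}
\langle \lambda h'.\, h'\langle \mu\alpha.s,\ \lambda h.u\rangle\rangle\,(\mu\gamma.[\gamma](x,\beta).t)
  &\to ([\gamma](x,\beta).t)\,[\gamma := \lambda h'.\, h'\langle \mu\alpha.s,\ \lambda h.u\rangle] \\
  &= (h'\langle \mu\alpha.s,\ \lambda h.u\rangle)\,[h' := (x,\beta).t] \\
  &= ((x,\beta).t)\,\langle \mu\alpha.s,\ \lambda h.u\rangle \\
  &\equiv \langle \lambda h.u\rangle\,(\mu\beta.(\lambda x.t)\,(\mu\alpha.s))
\end{align*}
```

The first step is the $\beta$ rule for q-application, the second and third unfold the base case of $\mu$-name substitution, and the result is the term assigned to the m-cut followed by the h-cut.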

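Finally, we mention the immediate reduction of axiom cuts, which is required
by the tq-protocol. It is easy to see that our substitution semantics agrees with the
immediate reduction of axiom cuts. To be more precise,

$$(\lambda x.t)\,(\mu\alpha.[\alpha]h) = \lambda x.t$$
$$(\lambda x.\langle\!\langle\alpha'\rangle\!\rangle x)\,(\mu\alpha.s) = \mu\alpha.s\,[\alpha := \alpha']$$

$$\langle\lambda h.t\rangle\,(\mu\alpha.[\alpha]h) = \lambda h.t$$
$$\langle\lambda h.[\alpha']h\rangle\,(\mu\alpha.s) = \mu\alpha.s\,[\alpha := \alpha']$$

These $\mu$-name substitutions stand for the syntactical replacement of $\alpha$ by $\alpha'$.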

3 CBN CPS-Transform of $\lambda\mu_n$-Calculus

In this section, we define the transform from $\lambda\mu_n$ into $\lambda\mu_t$. Our goal is to prove
that the SN and CR property of $\lambda\mu_n$ is a consequence of that of $\lambda\mu_t$.
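Table 3. $\lambda\mu_n$-Term Assignment for CND

$$\frac{}{x :\ A^x \Rightarrow A} \qquad \frac{L :\ \Gamma, A^x \Rightarrow B, \Delta}{\lambda x.L :\ \Gamma \Rightarrow A \to B, \Delta}$$

$$\frac{L :\ \Gamma \Rightarrow B, A^\alpha, \Delta}{\mu^*\alpha.[\beta]L :\ \Gamma \Rightarrow A, B^\beta, \Delta}\ \text{rename}$$

$$\frac{M :\ \Gamma_0 \Rightarrow A \to B, \Delta_0 \qquad N :\ \Gamma_1 \Rightarrow A, \Delta_1}{(M\ N) :\ \Gamma_0, \Gamma_1 \Rightarrow B, \Delta_0, \Delta_1}$$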






3.1 Parigot’s $\lambda\mu_n$-Calculus

To begin with, we adopt the $\lambda\mu_n$-calculus of Parigot from [9]. It is defined on a natural
deduction-style logic, called Classical Natural Deduction (CND). Sequents
of CND are similar to those of LK. The only difference is that they have exactly one
unindexed formula on the rhs. The term assignment for CND is displayed in Table 3.
We refer to the terms as $\lambda\mu_n$-terms, ranged over by $L, M, N$, etc. There are three
redex rules, namely $\beta$, $\zeta$ and $\mu$-$\eta$:

$$\begin{array}{rcl}
(\lambda x.M)\ N & \to_{\beta} & M\,[x := N] \\
(\mu^*\gamma.[\delta]M)\ N & \to_{\zeta} & \mu^*\beta.[\delta]M\,[\gamma := \lambda x.[\beta](x\ N)] \\
\mu^*\delta.[\alpha'](\mu^*\alpha.[\beta]L) & \to_{\mu\text{-}\eta} & \mu^*\delta.([\beta]L)\,[\alpha := \alpha']
\end{array}$$

In the $\zeta$ redex rule, we use one more new notion called $\mu$-name substitution for
the $\lambda\mu_n$-calculus.

Definition 2 ($\mu$-name substitution for $\lambda\mu_n$). The $\mu$-name substitution is of
the form $M\,[\gamma := \lambda x.[\beta](x\ N)]$. Its result is obtained from $M$ by recursively
replacing every named subterm of $M$ of the shape $[\gamma]L$ by $[\beta](L\ N)$.

That is, if we interpret the named term $[\gamma]L$ as a t-application, i.e., as $(\gamma\ L)$, $\mu$-name
substitution can be understood as a standard variable substitution. Note that
$\zeta$ is essentially identical with Parigot’s structural reduction.
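Definition 2 and the $\zeta$ rule can be prototyped in a few lines. The sketch below rests on our own assumptions: terms are nested tuples, $\mu^*a.[b]L$ is stored as `("mu", a, b, L)`, bound names are distinct from free ones, and the fresh name for the contractum is supplied by the caller. The function names are hypothetical.

```python
# Term shapes: ("var", x), ("lam", x, M), ("app", M, N),
# ("mu", a, b, L) standing for mu* a.[b]L.

def subst_named(name, body, gamma, beta, arg):
    """Apply [gamma := lam x.[beta](x arg)] to the named term [name]body."""
    new_body = struct_subst(body, gamma, beta, arg)
    if name == gamma:
        # [gamma]L is replaced by [beta](L arg)
        return beta, ("app", new_body, arg)
    return name, new_body


def struct_subst(term, gamma, beta, arg):
    """Mu-name substitution term[gamma := lam x.[beta](x arg)]."""
    tag = term[0]
    if tag == "var":
        return term
    if tag == "lam":
        return ("lam", term[1], struct_subst(term[2], gamma, beta, arg))
    if tag == "app":
        return ("app",
                struct_subst(term[1], gamma, beta, arg),
                struct_subst(term[2], gamma, beta, arg))
    if tag == "mu":
        _, binder, name, body = term
        if binder == gamma:
            return term  # gamma is rebound here, nothing to replace below
        new_name, new_body = subst_named(name, body, gamma, beta, arg)
        return ("mu", binder, new_name, new_body)
    raise ValueError("unknown term: %r" % (term,))


def zeta_step(redex, fresh):
    """Contract (mu* gamma.[delta]M) N into mu* fresh.([delta]M)[gamma := ...]."""
    _, fun, arg = redex
    _, gamma, delta, body = fun
    new_name, new_body = subst_named(delta, body, gamma, fresh, arg)
    return ("mu", fresh, new_name, new_body)
```

In this model $(\mu^*\gamma.[\gamma]x)\ y$ contracts to $\mu^*\beta.[\beta](x\ y)$, while $(\mu^*\gamma.[\delta]x)\ y$ contracts to $\mu^*\beta.[\delta]x$ because $\gamma$ names no subterm.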

Definition 3 ($\lambda\mu_n$ as a reduction system). The reduction relation $\to_{\lambda\mu_n}$
of $\lambda\mu_n$, viewed as a rewrite system, is defined to be the compatible closure of the
notion of reduction defined by the three redex rules, namely $\beta$, $\zeta$ and $\mu$-$\eta$.

The relation $\to_{\lambda\mu_n}$ defines the full $\lambda\mu$-reduction relation; its reflexive
and transitive closure is written $\Rightarrow_{\lambda\mu_n}$. We say $M$ is intuitionistic if it is of the form
$x$, $\lambda x.M$ or $(M\ N)$.







3.2 The “Naive” CPS-Transform 

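The naive transform $\overline{M}$ (a $\lambda\mu_t$-term) of a $\lambda\mu_n$-term $M$ is based on the naive, inductive
transform of each CND derivation rule into a sequence of LKT derivation rules.

$$\begin{aligned}
\overline{x} &= \mu\alpha.\langle\!\langle\alpha\rangle\!\rangle x \\
\overline{\lambda x.L} &= \mu\gamma.[\gamma](x,\beta).(\langle\!\langle\beta\rangle\!\rangle\overline{L}) \\
\overline{(M\ N)} &= \mu\beta.\langle\lambda h.h\langle\overline{N}, \beta\rangle\rangle\overline{M}
\end{aligned}$$

In the above, we use the abbreviation $\langle\!\langle\beta\rangle\!\rangle t$ for $\langle\lambda y.[\beta]y\rangle t$. We show how the
application rule is transformed into an LKT derivation:

$$\frac{u :\ ;\ \Rightarrow (A \to B)^\gamma \qquad \dfrac{s :\ ;\ \Rightarrow A^\alpha \qquad [\beta]h :\ B^h;\ \Rightarrow B^\beta}{h\langle\mu\alpha.s, \lambda h.[\beta]h\rangle :\ (A \to B)^h;\ \Rightarrow B^\beta}\ \mathrm{L}{\to}}{\langle\lambda h.h\langle\mu\alpha.s, \lambda h.[\beta]h\rangle\rangle\,(\mu\gamma.u) :\ ;\ \Rightarrow B^\beta}\ \text{h-cut}$$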

3.3 The “Modified” CPS-Transform 

Then we proceed to the “modified” transform. The intention of this transform
is to remove the disturbing m-cuts during the transform. Specifically, we transform a
$k$-successive application of the form $(\ldots(M\ N_1)\ldots N_k)$ into $k$ successive L$\to$ rules
in LKT. In fact, as we show later, this transform maps normal $\lambda\mu_n$-terms to
normal $\lambda\mu_t$-terms. These m-cuts are part of what Plotkin calls administrative
redexes [10]. A similar observation is also reported by Herbelin [5]. His solution
was to introduce a special kind of term, called an argument list.

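Let $r$ range over the continuations generated by the grammar $r ::= \alpha \mid \lambda h.h\langle\mu\alpha.s, r\rangle$.
Now we introduce the modified transform $\underline{M}$ of $M$ as follows:

$$\begin{aligned}
\underline{L} &= \mu\beta.(L : \beta) \\
\underline{\mu^*\alpha.[\beta]L} &= \mu\alpha.(L : \beta)
\end{aligned}$$

In the first line, $L$ is intuitionistic and $\beta$ is a fresh $\mu$-name. Then the infix
operator “colon (:)” : $\lambda\mu_n$-term $\times$ $\lambda\mu_t$-term $\to$ $\lambda\mu_t$-term is defined as follows:

$$\begin{aligned}
x : r &= \langle r\rangle\,x \\
(\lambda x.L) : \gamma &= [\gamma](x,\beta).(L : \beta) \\
(\lambda x.L) : \lambda h.h\langle\underline{N}, r\rangle &= ((x,\beta).(L : \beta))\,\langle\underline{N}, r\rangle \\
(M\ N) : r &= M : \lambda h.h\langle\underline{N}, r\rangle \\
(\mu^*\alpha.[\beta]L) : r &= \langle r\rangle\,(\mu\alpha.(L : \beta))
\end{aligned}$$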
In the second and third lines, we assume $L$ is intuitionistic and $\beta$ is a fresh name.
Otherwise, it is defined as follows:

$$\begin{aligned}
\lambda x.(\mu^*\beta.[\delta]L) : \gamma &= [\gamma](x,\beta).(L : \delta) \\
\lambda x.(\mu^*\beta.[\delta]L) : \lambda h.h\langle\underline{N}, r\rangle &= ((x,\beta).(L : \delta))\,\langle\underline{N}, r\rangle
\end{aligned}$$

This transform is very similar to Plotkin’s “colon” translation. The difference
is that we ask the “argument part” $N$ of $(M\ N)$ to be transformed by this modified
transform again. This is because we consider full reduction, instead of the CBN
reduction strategy. In the following, we omit most proofs for lack
of space.

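Lemma 1. If $M$ is normal, then $\underline{M}$ is normal.

Proof. By induction on the structure of $M$.



Lemma 2. If $\gamma$ does not occur in $M$, then $(M : r)\,[\gamma := r'] = M : (r\,[\gamma := r'])$.

Proof. By induction on the structure of $M$. For example,

$$\begin{aligned}
(\lambda x.L : \gamma)\,[\gamma := \lambda h.h\langle\underline{N}, r\rangle] &= ([\gamma](x,\beta).(L : \beta))\,[\gamma := \lambda h.h\langle\underline{N}, r\rangle] \\
&= ((x,\beta).(L : \beta))\,\langle\underline{N}, r\rangle \\
&= \lambda x.L : \lambda h.h\langle\underline{N}, r\rangle
\end{aligned}$$



Lemma 3. $\langle\!\langle r\rangle\!\rangle\underline{N} \to_{\lambda\mu_t} N : r$

Proof. By cases on whether $N$ is intuitionistic or not.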



Lemma 4. L =>AjLit L. 



Proof. By induction on the structure of L. For example, suppose L is of the form 
(M TV), where M = (Mi M2). 



(MTV) = 






iafJ.{Xh.h{N, fJ)))M 
iif3iXh.h{^J3))K i.h. 
fi(3.{Xh.h{K, /3)))(/i7.(M :7)) 
: Xh.h{KX)) 

{M N) 
3.4 $\lambda\mu_t$ Simulates $\lambda\mu_n$

In this subsection, we establish the simulation of the normalization for CND by the
tq-protocol for LKT.

Lemma 5. M [x := ^ 

Proof. By induction on the structure of M. For the base case, 

X [x := TV] = {fia.{a}x) [x := TV] 

= ua.((a}N 

= x[x := TV] 



Lemma 6. {M : r) [x := •= ^ 

Proof. By induction on the structure of M. For the base case, 

{x:r)[x:=m = {r}K ~^x^, K^r 

Lemma 7. {{Xx.L) TV) =^Xf^t ^ 

Proof. Suppose M is intuitionistic. Another case is similar. 

((Ax.L) TV) = /i/?'.((x, /?) . (L : /?)) % /?') 

/i/?'.(L:/?) [x := ^] [/? := /?'] 
^A^, [x :=TV]):/?' 

= L [x := TV] 



Lemma 8. 

M [7 := Xh.h{K.f3)] ^A^, M[j:=Xx.[f3]{xN)] 

{M :r) [7 := Ah.h(^,/3}] {M [r := \x.[p]{x V)]) : (r [7 := Ah.h{^,[3)]) 

Proof. By induction on the structure of M. Suppose M = (Mi M 2 ). 

M[t := Ah.h{K,P)] = /i<f.(M: 7 )[. ..] 

= [7 := Xx.[f3]{x V)]) : Xh.h{N_, (3) 

= /ui.((M[...] iV):/?) 

= /i*(i.[/?](M[...] N) 



= M [-(■.= Xx.[(3]{x N)] 
From the second to the third line, we use a secondary induction as follows:

((MiM2):r) [j := Xh.h{KJJ)] = {Mi : Xh.h{^,r)) h ’= Xh.h(KJJ)] 

= {Mi[. . .]) : {Xh.h{M2,r)) [7 := Xh.h{K,f^)] 

= Mi[. . : {Xh.h{ M 2 [7 := Xx,[f3]{x N)] ,r [. . .])) 
= ((MiM 2) [-f:=Xx\l3]{x iV)]):r[...] 

Lemma 9. ((//*7.[^]M) N) [7 := Xx,[f3]{x TV)] 

Proof. By induction on the structure of M. The only interesting case is as follows: 

((/i*7.[h]M) TV) = /./?.(((/.*7.[^]TIV)TV):/?) 

= /i/?.((/i*7.[^]TlV) : Xh.h{K^ f3)) 

= u0.((Xh.h(N , 0)} iu'yM : ^) 

fi0fM:6) [7 := Xh.h{K:f3)] 

= fj.^0.[S]M [7 := Xh.hjN, 0)] 

^A^t fi^0f6]M [7 := Xx\0]{x TV)] 



Lemma 10. /i*o;.[/?](/i*/?'.[^]TlV) — (/i*o;.[^]TlV) [0' := 0] 

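Proof. Easy.

Theorem 2 (Simulation). If $M \to_{\lambda\mu_n} N$ then $\underline{M} \Rightarrow_{\lambda\mu_t} \underline{N}$.

Proof. Lemma 7, Lemma 9 and Lemma 10 show that the $\beta$, $\zeta$ and $\mu$-$\eta$ redex rules
are correctly simulated by $\Rightarrow_{\lambda\mu_t}$.

Corollary 1. The SN and CR property of $\lambda\mu_n$ is a corollary of that of $\lambda\mu_t$.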

3.5 Relation to De Groote’s Work 

First, we briefly quote our results described in [7, 8]. We have shown that
CBN CPS-calculi are exactly the term calculi on the (intuitionistic decoration
of) LKT; CBN CPS-calculi simulate the tq-protocol for LKT.

Theorem 3 (Simulation). The $\lambda\mu_t$-calculus can be simulated by the $\lambda$-calculus (i.e., the
Plotkin-style CBN CPS-calculus).
The transform of A/i|^-term s into Plotkin-style CBN CPS-terms 5 * are fairly 
simple. It can be described briefly as follows: ([o;]s)* = {k s*)^ (((q;))^)* = {s* /c), 
{jaa.s)* = Xk.s"^ and (((t))s)* = (s* t*). A given mapping from /x-names into con- 
tinuation variables such as (A")* = (^A^)^, (B^)* = , ((A ^ R)^)* = 
{^{A are assumed. For object variable, {A^Y =

sumed. With this further simulation of our “naive” CPS-transform recovers 

De Groote’s CPS-transform of A/in as follows: 

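$$\begin{aligned}
\overline{x} &= \lambda k.x\ k \\
\overline{\lambda x.L} &= \lambda k''.k''\ (\lambda(x, k').(\overline{L}\ k')) \\
\overline{(M\ N)} &= \lambda k'.\overline{M}\ (\lambda h.h\ (\overline{N}, k')) \\
\overline{\mu^*\alpha.[\beta]L} &= \lambda k.\overline{L}\ k'
\end{aligned}$$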

Note that this transform induces the “double negation translation” on types. 
Hence we can see that Plotkin’s CPS-transform can be divided into two phases. 
One is the simulation of CND by LKT, and the other is the simulation of LKT 
by its intuitionistic decoration in LJ. Recall that the intuitionistic 
decoration is a homomorphic transform. Thus the simulation of CND by LKT is 
the essential part of the CPS-transform. This fact connects the theory of 
constructive classical logic with the CPS-transform. 

4 Conclusions and Further Directions 

Naturally, the CBV system needs further study. One can easily adapt the term 
assignment method (i.e., first part) to the LKQ which is the dual of the LKT. 
From its duality, A/xq-calculus seems to be the candidate to simulate CBV version 
of A/i-calculus. 

Our transform allows A/in-calculus to be interpreted in LKT, hence its se- 
mantics. It should be more understood how these syntactical characterization of 
CBN relate to their (maybe more intrinsic) characterizations using domains, cat- 
egories and more recently games. We should not be satisfied with the embedding 
into the coherent space. 

It is known that $A \Rightarrow B$, the set of stable morphisms in stable 
domain theory, is the same as $(!A) \multimap B$, the set of linear morphisms 
that linearize the source space. The discovery of this splitting of 
$\Rightarrow$ into $!$ and $\multimap$ was a very important step towards 
Girard’s discovery of linear logic. Our investigation seems to go the other 
way round. It strongly suggests the existence of a “classical” CBN stable 
domain theory; that is, $(!?A) \multimap (?B)$ represents a CBN stable function. 
Abstract. We have formally verified the MCS list-based queuing lock 
algorithm (MCS) with CafeOBJ and UNITY. We have shown that it has two 
properties: no two processes can be in their critical sections 
simultaneously, and a process wanting to enter its critical section 
eventually enters it. First, a simple queuing lock algorithm (MCS0) is 
specified in CafeOBJ by adopting the UNITY computational model and 
verified with UNITY logic. Secondly, a queuing lock algorithm (MCS1), 
specified in the same way as MCS0, is verified by showing the existence 
of a simulation relation from MCS1 to MCS0 with the help of CafeOBJ. 
Lastly, MCS is derived from a slightly modified MCS1. 



1 Introduction 

We have formally verified the MCS list-based queuing lock algorithm (MCS) [6] 
with CafeOBJ [2] and UNITY [1]. MCS is a scalable algorithm for spin locks 
that generates O(1) remote references per lock acquisition, independent of the 
number of processes (or processors) attempting to acquire the lock. We have 
shown that MCS has two properties: a safety property and a liveness property. 
The safety property is that no two processes can be in their critical sections 
simultaneously when MCS is used to implement mutual exclusion. The liveness 
property is that a process wanting to enter its critical section eventually 
enters it when MCS is used to implement mutual exclusion. 

The verification is divided into three stages. First, a simple queuing lock 
algorithm (MCS0) is specified in CafeOBJ by adopting the parallel computational 
model of UNITY, and verified w.r.t. the two properties using UNITY logic with 
the help of the CafeOBJ rewrite engine [7]. Secondly, a queuing lock algorithm 
(MCS1) is specified in the same way as MCS0. We then show that there exists a 
simulation relation from MCS1 to MCS0 with the help of the CafeOBJ rewrite 
engine, which indicates that MCS1 has the safety properties that MCS0 has [4]. 
MCS1 is also verified w.r.t. the liveness property using the simulation 
relation and UNITY logic. Lastly, MCS is derived from a slightly modified MCS1. 



P.S. Thiagarajan, R. Yap (Eds.): ASIAN’99, LNCS 1742, pp. 281-293, 1999. 
(c) Springer-Verlag Berlin Heidelberg 1999 
2 Preliminaries 



The parallel computational model of UNITY is basically a labeled transition 
system. It has some initial states and finitely many transition rules. The 
execution starts from one initial state and goes on forever; in each step some 
transition rule is nondeterministically chosen and executed. The 
nondeterministic choice is constrained by the following fairness rule: every 
transition rule is chosen infinitely often. Execution of one transition rule 
may simultaneously change multiple components of the system (or none at all). 
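
As an illustration only (it is not part of UNITY or of the verification 
described here), the execution model can be sketched in Python. The name 
run_fair, the tuple-valued state, and the two example rules are hypothetical. 
Firing every rule once per round in a shuffled order is one simple way, 
assumed here, to approximate the fairness rule on a finite run. 

```python
import random

def run_fair(state, rules, rounds, seed=0):
    """Run `rules` on `state` for `rounds` rounds; return final state and trace.

    In every round each rule fires exactly once, in a random order, so an
    unbounded run would choose every rule infinitely often (UNITY fairness).
    """
    rng = random.Random(seed)
    trace = [state]
    for _ in range(rounds):
        order = list(rules)
        rng.shuffle(order)           # nondeterministic choice within the round
        for rule in order:
            state = rule(state)      # a rule may leave the state unchanged
            trace.append(state)
    return state, trace

# Hypothetical program over a state (x, y) with two transition rules.
def inc_x(s):
    x, y = s
    return (x + 1, y) if x < 3 else s

def set_y(s):
    x, y = s
    return (x, True) if x == 3 else s

final, trace = run_fair((0, False), [inc_x, set_y], rounds=5)
print(final, len(trace))
```

Because inc_x fires once per round, x reaches 3 after three rounds, and set_y 
then fires no later than the fourth round, whatever order the shuffle picks. 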

UNITY has a mini programming language to represent the parallel computational 
model. A program consists of a declaration of variables, a specification of 
their initial values, and a set of multiple-assignment statements. A 
multiple-assignment statement corresponds to a transition rule. 

UNITY also provides a proof system based on a logic that extends Floyd-Hoare 
logic to parallel programs, with a flavor of temporal logic [5]. The logic is 
based on assertions of the form $\{p\}\ s\ \{q\}$, denoting that execution of 
statement $s$ in any state that satisfies predicate $p$ results in a state that 
satisfies predicate $q$, if execution of $s$ terminates. Properties of a UNITY 
program are expressed using assertions of the form $\{p\}\ s\ \{q\}$, where $s$ 
is universally or existentially quantified over the statements of the program. 
Each property is classified as either a safety or a liveness property. 
Existential quantification over program statements is essential in stating 
liveness properties, whereas safety properties can be stated using only 
universal quantification over statements (and the initial condition). 
Although all properties of a program can be expressed directly using 
assertions, a few additional terms are introduced for conveniently describing 
properties of programs: unless, stable, invariant, ensures, and leads-to (or 
$\mapsto$). The first four terms are defined as follows: 

$p\ \mathbf{unless}\ q \equiv (\forall s : s\ \mathrm{in}\ F :: \{p \land \lnot q\}\ s\ \{p \lor q\})$, 
$\mathbf{stable}\ p \equiv p\ \mathbf{unless}\ \mathit{false}$, 
$\mathbf{invariant}\ p \equiv (\text{initial condition} \Rightarrow p) \land \mathbf{stable}\ p$, and 
$p\ \mathbf{ensures}\ q \equiv (p\ \mathbf{unless}\ q) \land (\exists s : s\ \mathrm{in}\ F :: \{p \land \lnot q\}\ s\ \{q\})$, 

where $F$ is a program, $s$ a statement, and $p$ and $q$ predicates. A given 
program has the property $p \mapsto q$ if and only if this property can be 
derived by a finite number of applications of the following inference rules: 

$\dfrac{p\ \mathbf{ensures}\ q}{p \mapsto q}$, 
$\dfrac{p \mapsto q,\ q \mapsto r}{p \mapsto r}$, and 
$\dfrac{(\forall m : m \in W :: p(m) \mapsto q)}{(\exists m : m \in W :: p(m)) \mapsto q}$ 

for any set $W$. 
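
The four definitions can be read as checks over a finite state space. The 
following Python sketch is an illustration under assumptions: the state space 
and the statements are passed explicitly, statements are total functions on 
states, and the toy counter program is hypothetical. As in the definitions, 
unless and stable quantify over all states, not only the reachable ones. 

```python
def unless(states, stmts, p, q):
    """p unless q: every statement maps a (p and not q)-state to a (p or q)-state."""
    return all(p(s(x)) or q(s(x))
               for x in states if p(x) and not q(x)
               for s in stmts)

def stable(states, stmts, p):
    """stable p is p unless false."""
    return unless(states, stmts, p, lambda x: False)

def invariant(states, stmts, init, p):
    """invariant p: p holds initially and p is stable."""
    return all(p(x) for x in states if init(x)) and stable(states, stmts, p)

def ensures(states, stmts, p, q):
    """p ensures q: p unless q, and a single statement establishes q
    from every (p and not q)-state."""
    return unless(states, stmts, p, q) and any(
        all(q(s(x)) for x in states if p(x) and not q(x)) for s in stmts)

# Hypothetical program F: a counter that saturates at 3, plus a no-op statement.
states = range(4)
stmts = [lambda x: min(x + 1, 3), lambda x: x]

print(ensures(states, stmts, lambda x: x == 1, lambda x: x == 2))
print(ensures(states, stmts, lambda x: x == 0, lambda x: x == 2))
print(invariant(states, stmts, lambda x: x == 0, lambda x: x <= 3))
```

The second query fails already at the unless part: incrementing from x = 0 
leads to x = 1, which satisfies neither predicate. Deriving x = 0 leads-to 
x = 2 therefore needs two ensures steps combined by the transitivity rule. 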

CafeOBJ provides notational machinery to describe labeled transition systems, 
together with the corresponding semantics, namely, hidden algebra [3]. In 
hidden algebra, a hidden sort represents (the states of) a labeled transition 
system. Action operations, which take the state of a labeled transition system 
and zero or more data values represented by visible sorts and return another 
(possibly the same) state of the system, can change the state of a labeled 
transition system. The state of a labeled transition system can be observed 
only with observation operations, which take the state of a labeled transition 
system and return the value of a data component of the system. 
3 The MCS List-Based Queuing Lock 



The MCS list-based queuing lock (MCS) is a scalable algorithm for spin locks 
that has the following properties: 



— it guarantees FIFO ordering of lock acquisitions; 

— it spins on locally-accessible flag variables only; 

— it requires a small constant amount of space per lock; and 

— it works equally well (requiring only O(1) network transactions per lock 
acquisition) on machines with and without coherent caches. 

The code in a traditional style is given below: 



type qnode = record 
    next : ^qnode 
    locked : Boolean 
type lock = ^qnode 

procedure acquireLock( L : ^lock, I : ^qnode ) 
    I->next := nil 
    pred : ^qnode := fetch&store( L, I ) 
    if pred != nil 
        I->locked := true 
        pred->next := I 
        repeat while I->locked 

procedure releaseLock( L : ^lock, I : ^qnode ) 
    if I->next = nil 
        if comp&swap( L, I, nil ) 
            return 
        repeat while I->next = nil 
    I->next->locked := false 



MCS uses two atomic operations, fetch&store and comp&swap. fetch&store takes 
as arguments the address of a memory location and a value, and indivisibly 
stores the value into the memory location and returns the old value stored in 
the memory location. comp&swap takes as arguments the address of a memory 
location and two values, and indivisibly stores the second value into the 
memory location only if the first value is equal to the old value stored in 
the memory location; it returns true if so and false otherwise. 
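
To make the two atomic operations and the pseudocode concrete, here is a 
minimal Python sketch. It is an illustration, not the verified artifact: the 
class names AtomicRef, QNode, and MCSLock and the function demo are 
hypothetical, a private mutex stands in for hardware atomicity, and spinning 
yields the processor with time.sleep(0). 

```python
import threading
import time

class AtomicRef:
    """A memory location with indivisible fetch&store and comp&swap.
    Assumption: a private mutex stands in for hardware atomicity."""
    def __init__(self, value=None):
        self._value = value
        self._guard = threading.Lock()

    def fetch_and_store(self, new):
        with self._guard:
            old, self._value = self._value, new
            return old

    def comp_and_swap(self, expected, new):
        with self._guard:
            if self._value is expected:
                self._value = new
                return True
            return False

class QNode:
    def __init__(self):
        self.next = None
        self.locked = False

class MCSLock:
    def __init__(self):
        self.tail = AtomicRef(None)       # the lock L of the pseudocode

    def acquire(self, node):
        node.next = None
        pred = self.tail.fetch_and_store(node)
        if pred is not None:              # queue was not empty: wait for hand-over
            node.locked = True
            pred.next = node
            while node.locked:
                time.sleep(0)             # spin on the node's own flag

    def release(self, node):
        if node.next is None:
            if self.tail.comp_and_swap(node, None):
                return                    # no successor: the queue is empty again
            while node.next is None:      # a successor is still linking itself in
                time.sleep(0)
        node.next.locked = False          # hand the lock to the successor

def demo(workers=4, rounds=100):
    """Increment a shared counter non-atomically inside the critical section."""
    lock, total = MCSLock(), [0]

    def work():
        node = QNode()
        for _ in range(rounds):
            lock.acquire(node)
            seen = total[0]
            time.sleep(0)                 # widen the race window on purpose
            total[0] = seen + 1
            lock.release(node)

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]

print(demo())
```

If mutual exclusion holds, no increment is lost and demo returns 
workers * rounds. Such a run is only a smoke test; it does not replace the 
proofs discussed in this paper. 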



4 Verification of MCS 



We formally verify MCS w.r.t. the following two properties: 

ME1 no two processes can be in their critical sections simultaneously when 
    MCS is used to implement mutual exclusion; and 
ME2 a process wanting to enter its critical section eventually enters it when 
    MCS is used to implement mutual exclusion. 
4.1 Simple Queuing Lock 

The simple queuing lock MCS0 uses two atomic operations with large granularity, 
atomicPut and atomicGet. atomicPut takes as input a queue and a queue item, and 
indivisibly puts the item at the end of the queue; atomicGet takes as input a 
queue, and indivisibly deletes the top item from the queue. The code for MCS0 
in a traditional style is given below: 

type queue = record 
    head : ^qnode 
    tail : ^qnode 

procedure acquireLock( Q : ^queue, I : ^qnode ) 
    I->next := nil 
    atomicPut( Q, I ) 
    repeat while I != Q->head 

procedure releaseLock( Q : ^queue, I : ^qnode ) 
    atomicGet( Q ) 

Processes can enter a critical section mutually exclusively with acquireLock 
and releaseLock. The code in a traditional style is as follows: 

acquireLock( &queue, &item[i] ) 
Critical Section 
releaseLock( &queue, &item[i] ) 

queue is a global variable, and item[i] is a variable local to each process i. 
Initially queue is empty, that is, queue->tail is nil. We will later refer to 
this code as ME-CODE. We assume that a process that has entered the critical 
section eventually exits from it. 



Specification We formally specify ME-CODE in CafeOBJ by adopting the UNITY 
computational model. First, ME-CODE is divided into six atomic actions: 
acquireLock, setNext, atomicPut, checkHead, releaseLock, and atomicGet. For 
example, setNext corresponds to (&item[i])->next := nil, and atomicPut to 
atomicPut( &queue, &item[i] ). Each atomic action corresponds to a UNITY 
assignment statement. Next, let each process have six states: rem0, try0-1, 
try0-2, try0-3, crit0, and exit0-1. For example, a process being in rem0 means 
that it is executing some code segment other than ME-CODE, and a process 
changes its state to try0-1 when it executes acquireLock, provided that its 
state is rem0. The state transition diagram is shown in Fig. 1 (a). 

We specify each atomic action with an action operator of CafeOBJ’s behavioral 
specification. The specification also has four observation operators, one for 
each variable or state in ME-CODE: p, head, tail, and next. We can use p to 
observe the state of each process. Positive integers are used to identify 
processes, and also their local item variables. Let 0 mean nil. The main part 
of the signature of the specification is as follows: 

-- Initial system state 
op init0 : -> SState0 
-- Observation operators 
bop p : NzNat SState0 -> PState0 
bops head tail : SState0 -> Nat 
bop next : NzNat SState0 -> Nat 
-- Action operators 
bops acquireLock setNext atomicPut : NzNat SState0 -> SState0 
bops checkHead releaseLock atomicGet : NzNat SState0 -> SState0 



SState0 is a hidden sort that represents the state of ME-CODE, and Nat, NzNat, 
and PState0 are visible sorts that represent natural numbers, positive 
integers, and the states of processes, respectively. init0 is any initial 
state of ME-CODE. 

In this paper, we give only the two sets of equations that define the state 
after a process has executed atomicPut and atomicGet, respectively. We will 
refer to the specification for ME-CODE as MCS0-SPEC. 

-- 3. after execution of ‘atomicPut’ 
ceq p(I,atomicPut(I,S)) = try0-3 if p(I,S) == try0-2 . 
ceq p(J,atomicPut(I,S)) = p(J,S) if I =/= J or p(I,S) =/= try0-2 . 
ceq head(atomicPut(I,S)) = I if p(I,S) == try0-2 and tail(S) == 0 . 
ceq head(atomicPut(I,S)) = head(S) if p(I,S) =/= try0-2 or tail(S) > 0 . 
ceq tail(atomicPut(I,S)) = I if p(I,S) == try0-2 . 
ceq tail(atomicPut(I,S)) = tail(S) if p(I,S) =/= try0-2 . 
ceq next(J,atomicPut(I,S)) = I if p(I,S) == try0-2 and tail(S) > 0 and J == tail(S) . 
ceq next(J,atomicPut(I,S)) = next(J,S) 
    if p(I,S) =/= try0-2 or tail(S) == 0 or J =/= tail(S) . 

-- 6. after execution of ‘atomicGet’ 
ceq p(I,atomicGet(I,S)) = rem0 if p(I,S) == exit0-1 . 
ceq p(J,atomicGet(I,S)) = p(J,S) if I =/= J or p(I,S) =/= exit0-1 . 
ceq head(atomicGet(I,S)) = next(head(S),S) 
    if p(I,S) == exit0-1 and next(head(S),S) > 0 . 
ceq head(atomicGet(I,S)) = head(S) if p(I,S) =/= exit0-1 or next(head(S),S) == 0 . 
ceq tail(atomicGet(I,S)) = 0 if p(I,S) == exit0-1 and next(head(S),S) == 0 . 
ceq tail(atomicGet(I,S)) = tail(S) if p(I,S) =/= exit0-1 or next(head(S),S) > 0 . 
eq next(J,atomicGet(I,S)) = next(J,S) . 

Verification We formally verify MCS0-SPEC w.r.t. ME1 and ME2 with UNITY logic 
and CafeOBJ. First, ME1 and ME2 are restated more formally. Let $p_i$ be the 
state of a process $i$. 

1. invariant $p_i = \textit{crit0} \land p_j = \textit{crit0} \Rightarrow i = j$; and 

2. $p_i = \textit{try0-1} \mapsto p_i = \textit{crit0}$. 

Proof sketch 1: We actually prove the following stronger property: 

invariant $p_i \in \{\textit{crit0}, \textit{exit0-1}\} \land p_j \in \{\textit{crit0}, \textit{exit0-1}\} \Rightarrow i = j$, 

where $x \in \{v_1, \ldots, v_n\}$ means $x = v_1$, ..., or $x = v_n$. If the 
property ‘invariant $p_i \in \{\textit{crit0}, \textit{exit0-1}\} \Rightarrow head = i$’ 
holds, the desired property can be derived from it. Thus we prove this 
property. In the initial state, the predicate is vacuously true. Thus it is 
sufficient to show that a process i changes its state to crit0 when it 
executes checkHead only if its state is try0-3 and head = i, and that head 
does not change unless the process i in crit0 or exit0-1 changes its state to 
rem0. In this paper, only the proof score for the latter case is shown. In the 
proof, we assume another property, 
‘invariant $p_i \in \{\textit{crit0}, \textit{exit0-1}\} \Rightarrow tail \neq nil$’, 
which can be shown in the same way. The proof score is given below: 
open MCS0-SPEC 
op s : -> SState0 . ops i j1 j2 j3 j4 : -> NzNat . 
eq p(i,s) = crit0 . eq p(j1,s) = rem0 . eq p(j2,s) = try0-1 . 
eq p(j3,s) = try0-2 . eq p(j4,s) = try0-3 . 
eq head(s) = i . eq tail(s) > 0 = true . eq j3 > 0 = true . 
red head(acquireLock(j1,s)) == i and tail(acquireLock(j1,s)) > 0 . 
red head(setNext(j2,s)) == i and tail(setNext(j2,s)) > 0 . 
red head(atomicPut(j3,s)) == i and tail(atomicPut(j3,s)) > 0 . 
red head(checkHead(j4,s)) == i and tail(checkHead(j4,s)) > 0 . 
red head(releaseLock(i,s)) == i and tail(releaseLock(i,s)) > 0 . 
close 

By having the CafeOBJ rewrite engine execute the proof score, we can show the 
latter case, namely that head does not change unless the process i in crit0 or 
exit0-1 changes its state to rem0. □ 
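
For a fixed small number of processes, the same invariants can also be 
checked by brute-force exploration of the reachable states. The Python sketch 
below is an illustration and not the proof method of this paper. Only the 
equations for atomicPut and atomicGet are given in the text; the effects 
assumed here for acquireLock, setNext, checkHead, and releaseLock are 
reconstructed from the prose description of the six process states. The 
constant N and all function names are hypothetical. 

```python
from collections import deque

N = 3                                    # processes 1..N; 0 plays the role of nil
ACTIONS = ("acquireLock", "setNext", "atomicPut",
           "checkHead", "releaseLock", "atomicGet")

def step(state, action, i):
    """State after process i executes `action` (unchanged if the guard fails)."""
    p, head, tail, nxt = state
    p, nxt = list(p), list(nxt)
    if action == "acquireLock" and p[i] == "rem0":
        p[i] = "try0-1"
    elif action == "setNext" and p[i] == "try0-1":
        p[i], nxt[i] = "try0-2", 0
    elif action == "atomicPut" and p[i] == "try0-2":
        p[i] = "try0-3"
        if tail == 0:
            head = i
        else:
            nxt[tail] = i
        tail = i
    elif action == "checkHead" and p[i] == "try0-3" and head == i:
        p[i] = "crit0"
    elif action == "releaseLock" and p[i] == "crit0":
        p[i] = "exit0-1"
    elif action == "atomicGet" and p[i] == "exit0-1":
        p[i] = "rem0"
        if nxt[head] > 0:
            head = nxt[head]
        else:
            tail = 0
    return (tuple(p), head, tail, tuple(nxt))

def reachable(init):
    """All states reachable from `init` by any process executing any action."""
    seen, todo = {init}, deque([init])
    while todo:
        s = todo.popleft()
        for a in ACTIONS:
            for i in range(1, N + 1):
                t = step(s, a, i)
                if t not in seen:
                    seen.add(t)
                    todo.append(t)
    return seen

def holds(state):
    """The strengthened ME1 together with the two auxiliary invariants."""
    p, head, tail, _ = state
    inside = [i for i in range(1, N + 1) if p[i] in ("crit0", "exit0-1")]
    return (len(inside) <= 1
            and all(head == i for i in inside)
            and all(tail != 0 for _ in inside))

INIT = ((None,) + ("rem0",) * N, 0, 0, (0,) * (N + 1))
states = reachable(INIT)
print(len(states), all(holds(s) for s in states))
```

Such a check covers only the chosen N, whereas the proof score above is 
carried out for arbitrary process identifiers. 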

Proof sketch 2: It is easy to show that the property 
‘$p_i = \textit{try0-1} \mapsto p_i = \textit{try0-3}$’ holds. Thus it is 
sufficient to show that the property 
‘$p_i = \textit{try0-3} \mapsto p_i = \textit{crit0}$’ holds. From these two 
properties, the desired one can be obtained by transitivity of leads-to. 

First we define a partial function d that takes as arguments two queue items 
and returns the distance between them in the queue, provided that both of them 
are in the queue and the first item is not behind the second one. It is 
defined as follows: 

$d(i, j) = \begin{cases} 0 & \text{if } i = j \\ 1 + d(i{\to}next,\ j) & \text{otherwise.} \end{cases}$ 
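
The recursive definition can be written as a short loop. In this Python 
sketch the queue is represented by a dictionary next_of from items to their 
successors, with 0 for nil. This representation and the use of None for 
"undefined" are assumptions made for illustration. 

```python
def d(next_of, i, j):
    """Distance from item i to item j along next pointers, or None if undefined.

    `next_of` maps an item to its successor; 0 plays the role of nil.
    """
    steps = 0
    while i != j:
        i = next_of.get(i, 0)
        steps += 1
        if i == 0 or steps > len(next_of):   # ran off the queue (or looped): undefined
            return None
    return steps

queue = {2: 3, 3: 1, 1: 0}                   # the queue 2 -> 3 -> 1
print(d(queue, 2, 2), d(queue, 2, 1), d(queue, 1, 2))
```

The last call returns None because item 1 is behind item 2, which is exactly 
the case the partial function leaves undefined. 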



To prove the property, it is sufficient to show the following one: 

$p_i = \textit{try0-3} \land d(head, i) = k \mapsto (p_i = \textit{try0-3} \land d(head, i) < k) \lor p_i = \textit{crit0}$. 

By applying the induction principle for leads-to to this one, the desired 
property ‘$p_i = \textit{try0-3} \mapsto p_i = \textit{crit0}$’ can be derived. 
It therefore remains to prove this property. 

If d(head, i) = 0, it is easy to show 
‘$p_i = \textit{try0-3} \mapsto p_i = \textit{crit0}$’ since head = i. So, 
suppose d(head, i) > 0. Then there must be exactly one process j such that 
head = j. The process j must be in try0-3, crit0, or exit0-1. All we have to 
do is to show that only the process j, when its state is exit0-1, decrements 
d(head, i), and that no other process changes d(head, i). The two cases in 
which a process k in try0-2 executes atomicPut and the process j in exit0-1 
executes atomicGet are more interesting than the others, and only these two 
cases are handled here. In the former case it is shown that neither head nor 
the next field of a non-tail item i in the queue changes, and in the latter 
case it is shown that head changes to the second item in the queue, but the 
next field of a non-tail item i does not change. In both cases, the process i 
does not change its state. These facts are shown by having the CafeOBJ rewrite 
engine execute the following proof score. 

open MCS0-SPEC 
op s : -> SState0 . ops i j k : -> NzNat . 
eq p(i,s) = try0-3 . eq p(j,s) = exit0-1 . eq p(k,s) = try0-2 . 
eq head(s) = j . eq tail(s) > 0 = true . eq next(j,s) > 0 = true . 
red next(i,atomicPut(k,s)) == next(i,s) and next(i,atomicGet(j,s)) == next(i,s) . 
red head(atomicPut(k,s)) == head(s) and head(atomicGet(j,s)) == next(head(s),s) . 
red p(i,atomicPut(k,s)) == try0-3 and p(i,atomicGet(j,s)) == try0-3 . 
close 

□ 
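
The role of d(head, i) as a decreasing measure can be tested on a small 
instance. The Python sketch below is an illustration, not the proof itself. It 
explores the reachable states of a hypothetical three-process model and 
checks, for every waiting process i, that no action increases d(head, i), that 
i enters crit0 when the distance is 0 and it executes checkHead, and that 
atomicGet by the head process decreases the distance by one. The effects of 
the four actions whose equations are not shown in the text are assumptions 
reconstructed from the prose, and fairness is not modelled. 

```python
from collections import deque

N = 3                                    # processes 1..N; 0 plays the role of nil
ACTIONS = ("acquireLock", "setNext", "atomicPut",
           "checkHead", "releaseLock", "atomicGet")

def step(state, action, i):
    """State after process i executes `action` (unchanged if the guard fails)."""
    p, head, tail, nxt = state
    p, nxt = list(p), list(nxt)
    if action == "acquireLock" and p[i] == "rem0":
        p[i] = "try0-1"
    elif action == "setNext" and p[i] == "try0-1":
        p[i], nxt[i] = "try0-2", 0
    elif action == "atomicPut" and p[i] == "try0-2":
        p[i] = "try0-3"
        if tail == 0:
            head = i
        else:
            nxt[tail] = i
        tail = i
    elif action == "checkHead" and p[i] == "try0-3" and head == i:
        p[i] = "crit0"
    elif action == "releaseLock" and p[i] == "crit0":
        p[i] = "exit0-1"
    elif action == "atomicGet" and p[i] == "exit0-1":
        p[i] = "rem0"
        if nxt[head] > 0:
            head = nxt[head]
        else:
            tail = 0
    return (tuple(p), head, tail, tuple(nxt))

def dist(state, i):
    """d(head, i): number of next links from head to item i, or None if undefined."""
    _, head, _, nxt = state
    cur, steps = head, 0
    while cur != i:
        cur, steps = nxt[cur], steps + 1
        if cur == 0 or steps > N:
            return None
    return steps

def reachable(init):
    seen, todo = {init}, deque([init])
    while todo:
        s = todo.popleft()
        for a in ACTIONS:
            for i in range(1, N + 1):
                t = step(s, a, i)
                if t not in seen:
                    seen.add(t)
                    todo.append(t)
    return seen

def metric_ok(s):
    """Check the measure argument for every process waiting in try0-3."""
    p, head = s[0], s[1]
    for i in range(1, N + 1):
        if p[i] != "try0-3":
            continue
        k = dist(s, i)
        if k is None:
            return False
        for a in ACTIONS:
            for j in range(1, N + 1):
                t = step(s, a, j)
                if t[0][i] == "crit0":
                    continue                 # goal reached
                if t[0][i] != "try0-3" or dist(t, i) is None or dist(t, i) > k:
                    return False             # the measure must never increase
        if k == 0 and step(s, "checkHead", i)[0][i] != "crit0":
            return False
        if (k > 0 and p[head] == "exit0-1"
                and dist(step(s, "atomicGet", head), i) != k - 1):
            return False
    return True

INIT = ((None,) + ("rem0",) * N, 0, 0, (0,) * (N + 1))
states = reachable(INIT)
print(all(metric_ok(s) for s in states))
```

Together with the fairness rule, which guarantees that the head process 
eventually executes its pending actions, these checks mirror the structure of 
the induction on k used above. 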
4.2 Queuing Lock 

We prove that the queuing lock MCS1 has the two properties ME1 and ME2 by 
showing that there exists a simulation relation from MCS1 to MCS0. MCS1 uses 
the same atomic operations as MCS. The code for MCS1 in a traditional style is 
given as follows: 

procedure acquireLock( Q : ^queue, I : ^qnode ) 
    I->next := nil 
    pred : ^qnode := fetch&store( &Q->tail, I ) 
    if pred = nil 
        Q->head := I 
    else 
        pred->next := I 
    repeat while I != Q->head 

procedure releaseLock( Q : ^queue, I : ^qnode ) 
    if Q->head->next = nil 
        if comp&swap( &Q->tail, Q->head, nil ) 
            return 
        repeat while Q->head->next = nil 
    Q->head := Q->head->next 

Specification We formally specify ME-CODE with MCS1 instead of MCS0 in the 
same way as ME-CODE with MCS0. The specification is referred to as MCS1-SPEC. 
Each process in MCS1-SPEC has 10 states: rem1, try1-1, try1-2, try1-3, try1-4, 
crit1, exit1-1, exit1-2, exit1-3, and exit1-4. There are 10 atomic actions: 
acquireLock, setNext, fetch&store, checkPred, checkHead, releaseLock, 
checkHNext1, comp&swap, checkHNext2, and setHead. The state transition diagram 
is shown in Fig. 1 (b). The main part of the signature of MCS1-SPEC is as 
follows: 

-- Initial system state 
op init1 : -> SState1 
-- Observation operators 
bop p : NzNat SState1 -> PState1 
bops head tail : SState1 -> Nat 
bops next pred : NzNat SState1 -> Nat 
-- Action operators 
bops acquireLock setNext fetch&store checkPred checkHead : NzNat SState1 -> SState1 
bops releaseLock checkHNext1 comp&swap checkHNext2 setHead : NzNat SState1 -> SState1 

In this paper, we give only the two sets of equations that define the state 
after a process has executed fetch&store and comp&swap, respectively. 

-- 3. after execution of ‘fetch&store’ 
ceq p(I,fetch&store(I,S)) = try1-3 if p(I,S) == try1-2 . 
ceq p(J,fetch&store(I,S)) = p(J,S) if I =/= J or p(I,S) =/= try1-2 . 
eq head(fetch&store(I,S)) = head(S) . 
ceq tail(fetch&store(I,S)) = I if p(I,S) == try1-2 . 
ceq tail(fetch&store(I,S)) = tail(S) if p(I,S) =/= try1-2 . 
eq next(J,fetch&store(I,S)) = next(J,S) . 
ceq pred(I,fetch&store(I,S)) = tail(S) if p(I,S) == try1-2 . 
ceq pred(J,fetch&store(I,S)) = pred(J,S) if I =/= J or p(I,S) =/= try1-2 . 

-- 8. after execution of ‘comp&swap’ 
ceq p(I,comp&swap(I,S)) = rem1 if p(I,S) == exit1-2 and tail(S) == head(S) . 
ceq p(I,comp&swap(I,S)) = exit1-3 if p(I,S) == exit1-2 and tail(S) =/= head(S) . 
ceq p(J,comp&swap(I,S)) = p(J,S) if I =/= J or p(I,S) =/= exit1-2 . 
eq head(comp&swap(I,S)) = head(S) . 
ceq tail(comp&swap(I,S)) = 0 if p(I,S) == exit1-2 and tail(S) == head(S) . 
ceq tail(comp&swap(I,S)) = tail(S) if p(I,S) =/= exit1-2 or tail(S) =/= head(S) . 
eq next(J,comp&swap(I,S)) = next(J,S) . 
eq pred(J,comp&swap(I,S)) = pred(J,S) . 

Fig. 1. Correspondence between states in MCS0 and MCS1 



Verification We formally verify MCS1-SPEC w.r.t. ME1 and ME2. First we prove 
that MCS1-SPEC has ME1 by showing that there exists a simulation relation from 
MCS1-SPEC to MCS0-SPEC with the help of the CafeOBJ rewrite engine. Next we 
prove that MCS1-SPEC has ME2 with UNITY logic and the simulation relation. 

Proof sketch 1: We make a mapping from each state in MCS1-SPEC to some state 
in MCS0-SPEC. Given a state of MCS1-SPEC, that is, each process’s state, next 
and pred, and the global head and tail, we have some corresponding state of 
MCS0-SPEC, that is, each process’s state and next, and the global head and 
tail. 

Given a state of MCS1-SPEC, each process’s state in MCS0-SPEC corresponding to 
the MCS1-SPEC state is defined as shown in Fig. 1. The correspondence is given 
in terms of equations. Only two equations are shown here: 

ceq p(I,sim(S)) = try0-3 if p(I,S) == try1-3 or p(I,S) == try1-4 . 
ceq p(I,sim(S)) = crit0 if p(I,S) == crit1 . 

Even if a process i in MCS1-SPEC executes some action, the others do not 
change their states. In the corresponding situation in MCS0-SPEC, all 
processes but i do not change their states either. Thus we also have equations 
such as 

ceq p(J,sim(acquireLock(I,S))) = p(J,sim(S)) if I =/= J . 

Given a state of MCS1-SPEC, the corresponding head in MCS0-SPEC is i if there 
exists a process i in MCS1-SPEC such that its state is try1-3 and its pred is 
nil, and otherwise it is the same as that in MCS1-SPEC. It would be sufficient 
to have the following two equations for mapping each state in MCS1-SPEC to 
head in MCS0-SPEC: 

ceq head(sim(S)) = I if p(I,S) == try1-3 and pred(I,S) == 0 . 
ceq head(sim(S)) = head(S) if p(I,S) =/= try1-3 or pred(I,S) > 0 . 

However, since these equations have a variable that does not appear in their 
left-hand sides, the proof process becomes complicated if they are used. So, 
instead of the two equations, we have equations that indicate how head in 
MCS0-SPEC changes when a process executes an action in MCS1-SPEC. Some of the 
equations are given here: 

eq head(sim(setNext(I,S))) = head(sim(S)) . 
ceq head(sim(fetch&store(I,S))) = I if p(I,S) == try1-2 and tail(S) == 0 . 
ceq head(sim(fetch&store(I,S))) = head(sim(S)) if p(I,S) =/= try1-2 or tail(S) > 0 . 

The first equation indicates how head in MCS0-SPEC changes if a process in 
MCS1-SPEC executes setNext, and the second and third equations indicate how it 
changes if a process executes fetch&store. The second equation specifies that 
head in MCS0-SPEC is set to i when a process i in MCS1-SPEC executes 
fetch&store, provided that its state is try1-2 and tail is nil, that is, the 
queue is empty. In this case, the corresponding action in MCS0-SPEC is 
atomicPut, as shown in Fig. 1. 

It is straightforward to map a state in MCS1-SPEC to tail in MCS0-SPEC. Given 
a state of MCS1-SPEC, the corresponding tail in MCS0-SPEC is the same as that 
in MCS1-SPEC. 

Given a state of MCS1-SPEC, the corresponding next of a process j in MCS0-SPEC 
is i if there exists a process i in MCS1-SPEC such that its state is try1-3 
and its pred is equal to j, and otherwise it is the same as that in MCS1-SPEC. 
It would be sufficient to have the following two equations for mapping each 
state in MCS1-SPEC to each next in MCS0-SPEC: 



ceq next(J,sim(S)) = I if p(I,S) == try1-3 and pred(I,S) > 0 and J == pred(I,S) . 
ceq next(J,sim(S)) = next(J,S) 
    if p(I,S) =/= try1-3 or pred(I,S) == 0 or J =/= pred(I,S) . 

As is the case for head, however, the proof process becomes complicated if 
these equations are used. So, instead of the two equations, we have equations 
that indicate how each next in MCS0-SPEC changes when a process executes an 
action in MCS1-SPEC. Some of the equations are given here: 

ceq next(I,sim(setNext(I,S))) = 0 if p(I,S) == try1-1 . 
ceq next(J,sim(setNext(I,S))) = next(J,sim(S)) if I =/= J or p(I,S) =/= try1-1 . 
ceq next(J,sim(fetch&store(I,S))) = I 
    if p(I,S) == try1-2 and tail(S) > 0 and J == tail(S) . 
ceq next(J,sim(fetch&store(I,S))) = next(J,sim(S)) 
    if p(I,S) =/= try1-2 or tail(S) == 0 or J =/= tail(S) . 

The first and second equations indicate how each next in MCS0-SPEC changes if 
a process in MCS1-SPEC executes setNext, and the third and fourth equations 
indicate how it changes if a process executes fetch&store. The third equation 
specifies that the next of process j in MCS0-SPEC is set to i when a process i 
in MCS1-SPEC executes fetch&store, provided that its state is try1-2, tail is 
not nil, and tail is equal to j. 
Now that we have defined the mapping sim from each state in MCS1-SPEC to some 
state in MCS0-SPEC, we give a candidate R for a simulation relation from 
MCS1-SPEC to MCS0-SPEC. The signature and equation for the candidate are shown 
here: 

op _R[_]_ : SState0 NzNat SState1 -> Bool 
eq S0 R[I] S1 = p(I,S0) == p(I,sim(S1)) and head(S0) == head(sim(S1)) and 
                tail(S0) == tail(sim(S1)) and next(I,S0) == next(I,sim(S1)) . 

Then we prove with CafeOBJ that the candidate is a simulation relation from 
MCS1-SPEC to MCS0-SPEC. It is easy to show that init0 and init1 are related by 
the candidate, i.e. init0 R[i] init1 for any process i. It remains to show 
that the candidate is preserved if any process executes any action in 
MCS1-SPEC. First we define two states, one for MCS0 and the other for MCS1, 
that are related by the candidate as follows: 

mod SIMREL1-PROOF1 { 
  pr(SIMREL1) 
  ops i j : -> NzNat  op s0 : -> SState0  op s1 : -> SState1 
  eq p(i,s0) = p(i,sim(s1)) . eq p(j,s0) = p(j,sim(s1)) . 
  eq head(s0) = head(sim(s1)) . eq tail(s0) = tail(sim(s1)) . 
  eq next(i,s0) = next(i,sim(s1)) . eq next(j,s0) = next(j,sim(s1)) . 
} 

SIMREL1 is the module in which the candidate is defined. We give some proof 
scores that show that the candidate is preserved if a process i executes some 
action. 

-- 3. after execution of ‘fetch&store’ 
-- 3.1 in the case that ‘tail(s1) = 0’ 
open SIMREL1-1-PROOF1 
eq p(i,s1) = try1-2 . eq tail(s1) = 0 . 
red atomicPut(i,s0) R[i] fetch&store(i,s1) . 
red atomicPut(i,s0) R[j] fetch&store(i,s1) . 
close 
-- 3.2 in the case that ‘tail(s1) > 0’ 
open SIMREL1-1-PROOF1 
eq p(i,s1) = try1-2 . eq tail(s1) > 0 = true . 
red atomicPut(i,s0) R[i] fetch&store(i,s1) . 
red atomicPut(i,s0) R[j] fetch&store(i,s1) . 
close 

The above two proof scores show that the relation R between s0 and s1 is 
preserved after a process i in try1-2 executes fetch&store in s1 and the 
corresponding process i executes atomicPut in s0. The former score covers the 
case where tail = nil, and the latter the case where $tail \neq nil$. In both 
scores, j denotes any process other than i. If a process whose state is not 
try1-2 executes fetch&store in s1, nothing changes. Thus in that case the 
relation is trivially preserved. 



-- 8. after execution of ‘comp&swap’ 
-- 8.1 in the case that ‘head(s1) = tail(s1)’ 
open SIMREL1-1-PROOF1 
eq p(i,s1) = exit1-2 . op k : -> NzNat . eq tail(s1) = k . 
eq head(s1) = tail(s1) . eq head(sim(s1)) = tail(sim(s1)) . 
eq next(k,s1) = 0 . eq next(k,sim(s1)) = 0 . eq next(k,s0) = next(k,sim(s1)) . 
red atomicGet(i,s0) R[i] comp&swap(i,s1) . 
red atomicGet(i,s0) R[j] comp&swap(i,s1) . 
close 
-- 8.2 in the case that ‘head(s1) =/= tail(s1)’ 
open SIMREL1-1-PROOF1 
eq p(i,s1) = exit1-2 . 
red s0 R[i] comp&swap(i,s1) . 
red s0 R[j] comp&swap(i,s1) . 
close 

The above two proof scores show that the relation R between s0 and s1 is 
preserved after a process i in exit1-2 executes comp&swap in s1 and the 
corresponding process i in s0 executes atomicGet or does nothing, depending on 
whether head = tail in s1 (and also in s0). 

Since we have proven that the candidate is a simulation relation from 
MCS1-SPEC to MCS0-SPEC as described above, we can say that MCS1-SPEC also has 
the safety properties, such as ME1, that MCS0-SPEC has. □ 

Proof sketch 2: It is sufficient to show 
‘$p_i = \textit{try1-4} \mapsto p_i = \textit{crit1}$’ in MCS1-SPEC in order 
that MCS1-SPEC has the liveness property ME2. For this purpose, all we have to 
do is to prove the following property in MCS1-SPEC: 

$p_i = \textit{try1-4} \land d(head, i) = k \mapsto (p_i = \textit{try1-4} \land d(head, i) < k) \lor p_i = \textit{crit1}$. 

Since we have shown that there is a simulation relation from MCS1-SPEC to 
MCS0-SPEC, mapped crit1 to crit0 and each of exit1-1 through exit1-4 to 
exit0-1, and proven that there is at most one process in crit0 and exit0-1 in 
MCS0-SPEC, there must be at most one process in crit1, exit1-1, exit1-2, 
exit1-3, and exit1-4. 

We show that a process in one of crit1 through exit1-4 eventually reaches 
exit1-4 if there is at least one process in try1-4. We handle here only the 
case that a process is in exit1-3. A process in crit1 or exit1-1 has to go 
through exit1-2 in order to reach exit1-3. A process in exit1-1 (or exit1-2) 
changes its state to exit1-2 (or exit1-3) when it executes checkHNext1 (or 
comp&swap) only if head->next = nil, i.e. head = tail (or $head \neq tail$, 
that is, another process k has executed fetch&store before the process 
executes comp&swap). In the latter case, the pred of k is equal to head, and 
head->next eventually becomes k, which is non-nil. Thus a process in exit1-3 
eventually reaches exit1-4. 

Since there is at most one process j in crit1 through exit1-4, head changes 
exactly to the second element in the queue whenever the process j reaches 
rem1, provided that there are at least two elements in the queue, that is, at 
least one process is in try1-4 before the process j reaches rem1. Moreover, 
since there is a simulation relation from MCS1-SPEC to MCS0-SPEC, we can say 
that the next fields of non-tail elements in the queue corresponding to 
processes in try1-4 do not change at least until the processes reach rem1. 
Consequently the desired property can be obtained. □ 

4.3 MCS Queuing Lock 

First, qnode’s locked field is used to slightly modify MCS1. The modified 
version is called MCS2, whose code in a traditional style is as follows: 

procedure acquireLock( Q : ^queue, I : ^qnode ) 
    I->next := nil 
    pred : ^qnode := fetch&store( &Q->tail, I ) 
    if pred = nil 
        I->locked := false 
        Q->head := I 
    else 
        I->locked := true 
        pred->next := I 
    repeat while I->locked 

procedure releaseLock( Q : ^queue, I : ^qnode ) 
    if Q->head->next = nil 
        if comp&swap( &Q->tail, Q->head, nil ) 
            return 
        repeat while Q->head->next = nil 
    Q->head, Q->head->next->locked := Q->head->next, false 

In the above code, x, y := V 2 means x and y are synchronously set to v\ and 
V 2 ^ The specification for ME-CODE with MCS2 in CafeOBJ is called MCS2- 
SPEC. We can show there exists a simulation relation from MCS2-SPEC to 
MCSl-SPEC, and then verify MCS2-SPEC w.r.t. MEl in the same way as 
the verification of MCSl-SPEC. Besides MCS2-SPEC can be also verified w.r.t. 
ME2 using the simulation relation and UNITY logic in the same way. 

In acquireLock of MCS2, the repeat while statement can be moved into the
else clause of the if statement because it is meaningless if pred = nil: in
that case I->locked is set to false. In releaseLock of MCS2, the last
assignment can be reduced to Q->head->next->locked := false. In addition, the
second argument I must be equal to Q->head because of the proof of the safety
property of MCS1 and the existence of a simulation relation from MCS2-SPEC to
MCS1-SPEC. Hence all occurrences of Q->head can be replaced by the second
argument I. Moreover, the two assignments can be deleted from the if clause
in acquireLock of MCS2. Consequently we can derive the MCS list-based queuing
lock by finally changing the first argument and &Q->tail into L : ^lock and L,
respectively.
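
The result of this derivation can be tried out with a small executable sketch.
The Python below is only an illustration under stated assumptions, not the
CafeOBJ specification: the names QNode, MCSLock and run_demo are hypothetical,
and fetch&store and comp&swap are emulated with an internal threading.Lock
because Python offers no such hardware primitives.

```python
import threading
import time


class QNode:
    """Per-process queue record with the two fields used by the lock."""

    def __init__(self):
        self.next = None
        self.locked = False


class MCSLock:
    """Sketch of the derived lock; `_atomic` only emulates atomic primitives."""

    def __init__(self):
        self.tail = None
        self._atomic = threading.Lock()

    def _fetch_and_store(self, new):
        with self._atomic:
            old, self.tail = self.tail, new
            return old

    def _compare_and_swap(self, expected, new):
        with self._atomic:
            if self.tail is expected:
                self.tail = new
                return True
            return False

    def acquire(self, node):
        node.next = None
        pred = self._fetch_and_store(node)
        if pred is not None:
            node.locked = True        # set before linking, as in MCS2
            pred.next = node
            while node.locked:        # spin, now inside the else branch
                time.sleep(0)

    def release(self, node):
        if node.next is None:
            if self._compare_and_swap(node, None):
                return                # queue is empty again
            while node.next is None:  # a successor is about to link itself
                time.sleep(0)
        node.next.locked = False      # hand the lock to the successor


def run_demo(num_threads=4, rounds=25):
    """Increment a shared counter non-atomically under the lock."""
    lock = MCSLock()
    counter = [0]

    def worker():
        node = QNode()
        for _ in range(rounds):
            lock.acquire(node)
            value = counter[0]
            time.sleep(0)             # invite interleaving inside the section
            counter[0] = value + 1
            lock.release(node)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter[0]
```

If mutual exclusion holds, run_demo(4, 25) returns 100, since no increment is
lost. This is only a test of one run, not a proof of ME1 or ME2.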

5 Conclusion 

We have described how MCS is verified w.r.t. ME1 and ME2 with CafeOBJ
and UNITY. The verification is divided into three stages. First, MCS0,
specified in CafeOBJ by adopting the UNITY computational model, has been
verified w.r.t. ME1 and ME2 using UNITY logic with the help of CafeOBJ.
Secondly, MCS1, specified in the same way as MCS0, has been verified w.r.t.
ME1 by showing the existence of a simulation relation from MCS1 to MCS0, and
verified w.r.t. ME2 using UNITY logic and the simulation relation. Lastly, MCS
has been derived from MCS2, which is a slightly modified MCS1.

Abstract. Normally a whole BDD is associated with its root node. We
argue that this is not necessarily the best choice. BDDs can be stored
more memory-efficiently if one node is allowed to represent several
BDDs.

BDDs made it possible to apply symbolic model checking [15] to systems
which could not be verified with methods based on an explicit state space
representation. However, they cannot always avoid the exponential blow-up
known as state space explosion, which is caused by the fact that
the system is built through a composition of modules.

We divide the variable index placed in the BDD nodes into two parts.
Only one of them is included in the new nodes, called CBDD-nodes. This
allows the use of one BDD label, i.e. one BDD node, for different
variables. Moreover, a branching possibility at the end of a BDD-part is
introduced such that BDD-parts situated in different layers of the BDD
can be shared.

A prototype BDD-library that uses cyclic BDDs (CBDDs) was implemented.
It was used to verify examples from the following domains:
hardware verification, model checking and communication protocols.
Some encouraging results have been achieved: certain functions could be
represented with a constant number of CBDD-nodes, while classical BDDs
grow linearly.



1 Introduction 

BDDs are the state-of-the-art representation for boolean functions. They
allow the application of symbolic model-checking to systems with 10^20 states
and beyond [6]. Most systems are built out of several distributed modules
communicating with each other. As all sequences of the system have to be
considered, the construction of the product automaton typically leads to an
exponential blow-up in the number of states.

Memory efficiency is a central problem of existing model-checkers. Although
BDDs lead to a compact system representation, they can grow exponentially in
the size of the system, causing an overflow of the main memory. The transition
relation and the set of reachable states are represented as BDDs. Model checking
temporal logic properties can now be reduced to the calculation of fix-points for
functions which represent sets of states. These functions are represented as BDDs
and so the calculations can be done very efficiently, treating many states in each
iteration step.

Often a system contains several modules of the same structure. For example,
in hardware verification the same standard module is used for slightly
different purposes, and in protocol verification a common protocol is used by
all communicating stations. In this case, the relation describing the
transitions of one module has the same structure for a set of modules. As each
module is represented with different variables, the modules appear in
different layers of the BDD and therefore cannot be shared using the reduction
rules of classical BDDs.

The aim of our approach is to make it possible to share nodes in different
layers. This variation of classical BDDs permits a representation with fewer
nodes while the efficiency of the BDD-algorithms is not affected. As the
approach only tries to reduce the memory consumption, there is no direct gain
of time. But if the verification of a system exhausts the whole main memory and
secondary memory has to be used, the additional reduction of space turns into a
significant reduction of time if it allows the computations to be performed
within the given memory limitations. Therefore, it is not useful to apply this
method to small systems which can easily be verified with low main memory
requirements. On the other hand, with a growing size of a system it becomes
more interesting to use space reduction techniques like the one presented here,
even when some additional time is spent on the determination of the additional
reduction. Moreover, the experiments will show that the potential for an
additional reduction increases with the size of a system.

The paper is organized as follows. In the next section, a short survey about 
some existing BDD variations and related work is given. In section 3 some pre- 
liminaries about BDDs and model-checking are explained while in section 4 the 
new structure of CBDDs is introduced. Finally, some experimental results are 
presented in section 5. 



2 Related Work 



There have been many BDD variations that try to overcome the weaknesses
of BDDs in representing, for example, multiplication or the hidden-weighted-bit
function, which require an exponential number of BDD-nodes.

Free BDDs: Free BDDs [12] use different variable orderings on different 
paths. They are a generalization of BDDs. Instead of a fixed ordering, an FBDD- 
type describes the different orderings on different paths. The FBDDs are con- 
structed according to a fixed type which is also a graph. The problem is that 
no efficient way is known to determine an optimal or at least a “good” type in 
general. 

Zero-suppressed BDDs: Zero-suppressed BDDs [17] use a different reduction
rule. A node is omitted if its 0-successor leads to false and not when its
successors are identical. Depending on the function to be represented,
either reduction rule can yield the smaller representation.







Frank Reffel 



Indexed BDDs: Indexed BDDs [3] allow variables to be tested several times 
on a path. This structure allows a very compact representation of the hidden- 
weighted-bit function but uniqueness is lost. The algorithms for the manipulation 
of IBDDs and the test of equality become more complicated. 

Differential BDDs (Δ-BDDs): In differential BDDs [1], the nodes do not
include the index of the variables but the difference of indices to the upper
variable. This allows a reduction of nodes in different layers as proposed in our
approach. The difference is that two BDD-parts in different layers can only be
shared if they are both at the bottom of the Δ-BDD. They are unique but a
transformation of a BDD into a Δ-BDD can also lead to a blow-up in the number
of nodes.

Function descriptors (for LIFs): Function descriptors (FDs) [13] are a
BDD-like representation for linearly inductive functions (LIFs). It is possible to
share parts in different layers and they can be manipulated very efficiently. The
drawback of the method is its strong regularity constraints. All modules of a
system must have exactly the same structure and there must be a way to arrange
them hierarchically.

BDD trees: BDD trees were introduced in [16]. They are a generalization 
of BDDs where the linear ordering is replaced by a tree order. A decomposition 
of the system has to be found which in case of a hierarchical system can be 
determined easily. McMillan managed to match the same complexity as classical 
BDD-algorithms but the algorithms for the determination of logical functions 
have to be changed and become more complicated. BDDs appearing in different 
layers cannot be shared and the method is only suitable for hierarchical systems. 

Symmetry and abstraction: Symmetry [9, 7] and abstraction [8] are very 
powerful methods to reduce the necessary resources for the verification of a 
system. The problem is that it has to be proven manually that the application 
of a certain abstraction or symmetry relation preserves the correctness of the 
verification. To exploit symmetry, the orbit relation and a representative for 
each orbit have to be determined, which can lead to very complex computations 
and demand an experienced user. 

Other approaches try to prove properties for systems with many identical
processes [4] or rings of processes [11]. These methods also require the manual
proof of preconditions and impose restrictions on both the system and the
properties which can be shown.

Our approach provides a different way to reduce memory consumption. To take
advantage of the modular structure of a system, only the potential structural
similarities have to be indicated, and there is no need to prove any
preconditions. Structural identities that exist in the BDD are found
automatically, and a more compact representation of the BDD is achieved.

3 Preliminaries

3.1 BDDs 

Ordered binary decision diagrams (OBDDs), introduced by Bryant [5], are a
graphical representation for boolean functions. An OBDD $G(f, \pi)$ with respect
to the function $f$ and the variable ordering $\pi$ is an acyclic graph with one
source and two sinks labeled with true and false. All other (internal) nodes are
labeled with a boolean variable $x_i$ of $f$ and have two outgoing edges, left
and right. For all edges from an $x_i$-labeled node to an $x_j$-labeled node, we
have $\pi(i) < \pi(j)$, such that on every path in $G$ the variables are tested
in the same order and each variable is tested at most once.

Reduced OBDDs with respect to a fixed variable ordering are a canonical
representation for boolean functions. An OBDD is reduced if isomorphic
sub-BDDs exist only once and nodes whose outgoing edges lead to the same
successor are omitted. The reduced OBDD is built directly, integrating the
reduction rules into the construction to avoid the construction of the full
graph.

The variable ordering $\pi$ can be chosen freely but it has a great influence on
the size of the OBDDs, e.g. there are functions which have OBDDs of linear
size for a “good” and of exponential size for a “bad” ordering. Determining the
optimal ordering is an NP-hard problem, but there exist heuristics which deliver
good orderings for most applications [2].
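
The two reduction rules and the influence of the ordering can be illustrated
with a minimal sketch. It is not an efficient BDD package: it expands the
function exhaustively, and the names mk, build, size and paired_and_or are
hypothetical. Node ids 0 and 1 stand for the leaves false and true.

```python
def mk(level, low, high, unique):
    """Create or reuse a node while applying both reduction rules."""
    if low == high:                # both edges lead to the same successor
        return low
    key = (level, low, high)
    if key not in unique:          # isomorphic sub-BDDs exist only once
        unique[key] = len(unique) + 2
    return unique[key]


def build(f, order, unique, env=None, level=0):
    """Build the reduced OBDD of f bottom-up by Shannon expansion."""
    env = {} if env is None else env
    if level == len(order):
        return int(f(env))
    var = order[level]
    low = build(f, order, unique, {**env, var: False}, level + 1)
    high = build(f, order, unique, {**env, var: True}, level + 1)
    return mk(level, low, high, unique)


def size(f, order):
    """Number of internal nodes of the reduced OBDD of f under `order`."""
    unique = {}
    build(f, order, unique)
    return len(unique)


def paired_and_or(env):
    """(a1 and b1) or (a2 and b2) or (a3 and b3)"""
    return any(env[f"a{i}"] and env[f"b{i}"] for i in (1, 2, 3))


good = ["a1", "b1", "a2", "b2", "a3", "b3"]   # dependent variables adjacent
bad = ["a1", "a2", "a3", "b1", "b2", "b3"]    # dependent variables far apart

print(size(paired_and_or, good))  # 6
print(size(paired_and_or, bad))   # 14
```

For this function the interleaved ordering needs 6 internal nodes while the
separated ordering needs 14, and the gap widens exponentially with more pairs.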

In the following, we will only speak of BDDs, however we always mean re- 
duced OBDDs. 

As already mentioned, the variable ordering has a great influence on the size
of the BDDs. A popular heuristic is to place variables that depend on each other
close to each other in the ordering. So we define blocks of variables where each
block corresponds to the variables of one module, because it is assumed that the
dependence inside a module is stronger than on variables outside the module.
The blocks are ordered such that communicating modules appear as close
as possible in the ordering. The ordering of the variables inside one block can be
chosen freely or determined with other heuristics, but in the following it will be
important that the orderings are identical for similar modules.

Since we intend to share BDD-parts in different levels of the BDD we will 
first define what is meant exactly with a BDD-part: 

Definition 1 (BDD-part).

A BDD-part identified with the node $x_i$ is the part of the whole BDD which
starts at node $x_i$ and includes all nodes of the same module reachable on a
path from $x_i$.

A maximal BDD-part is a BDD-part which has a parent node which belongs
to another module.

We use the convention that the false-leaf belongs to a BDD-part if it is
reachable from one of its nodes. In contrast, the true-leaf is an independent
BDD-part and does not belong to any other BDD-part. This is necessary for the
following inductive definition of structural identity for BDD-parts. The
motivation for the different treatment of the two leaves is that in most
applications an edge to false from an upper level of the BDD is much more
frequent than an edge leading directly to true. The definition of structurally
identical BDD-parts is necessary to describe the additional reduction provided
by CBDDs.

Definition 2 (Structural identity). Two BDD-parts $p$ and $q$ are structurally
identical with distance $k$ (denoted by $id_k(p, q)$), if and only if:

- $p = q = false$, or

- $var(p) = x_i$ and $var(q) = x_{i+k}$ and the successors satisfy the
  following conditions:

  • $id_k(1(p), 1(q))$ or both do not belong to $p$, resp. $q$, and

  • $id_k(0(p), 0(q))$ or both do not belong to $p$, resp. $q$,

where $0(s)$ and $1(s)$ designate the 0- and the 1-successor of node $s$,
respectively, and $var(s)$ the variable index of $s$.
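
Definition 2 translates almost literally into a recursive check. The sketch
below is an illustration only. It assumes that a variable with global index v
belongs to module v // m (blocks of m variables per module), that leaves are
the Python values True and False, and that the names Node, belongs and
structurally_identical are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    var: int        # global variable index
    low: object     # 0-successor: Node, True or False
    high: object    # 1-successor: Node, True or False


def belongs(child, module, m):
    """Does `child` belong to the BDD-part of `module`?"""
    if child is False:
        return True             # the false leaf belongs to the part
    if child is True:
        return False            # the true leaf is an independent part
    return child.var // m == module


def structurally_identical(p, q, k, m, mod_p=None, mod_q=None):
    """id_k(p, q) for parts rooted at p and q, following Definition 2."""
    if p is False and q is False:
        return True
    if not isinstance(p, Node) or not isinstance(q, Node):
        return False
    if q.var != p.var + k:
        return False
    if mod_p is None:           # modules of the two part roots
        mod_p, mod_q = p.var // m, q.var // m
    for sp, sq in ((p.high, q.high), (p.low, q.low)):
        both_outside = (not belongs(sp, mod_p, m)
                        and not belongs(sq, mod_q, m))
        if not (both_outside
                or structurally_identical(sp, sq, k, m, mod_p, mod_q)):
            return False
    return True


# Two modules with m = 2 variables each and the same local structure.
n3 = Node(3, False, True)
n2 = Node(2, False, n3)
n1 = Node(1, False, n2)
n0 = Node(0, False, n1)
print(structurally_identical(n0, n2, 2, 2))  # True
```

Here the part rooted at n0 (module 0) and the part rooted at n2 (module 1)
differ only in where their exit edges lead, which the definition ignores.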



3.2 Model Checking 

In the following, we will suppose that the system consists of identical modules 
only, but in fact our approach can be used whenever there exist at least some 
modules which are represented by isomorphic sets of variables. The only require- 
ment we make is that there exists a partitioning of the system into modules. 

A state $s$ of the whole system corresponds to the product of the states of the
single modules: $s = (s_1, \ldots, s_n)$. The transition relation $Trans(s, t)$
is constructed using the transition relations of the single modules
$Trans_i(s, t_i)$. Such a relation changes the state of module $i$ from $s_i$ to
$t_i$. As a module can communicate with other modules, this transition can
depend on the states of all modules, hence the whole state $s$ serves as input
to $Trans_i$.

In the simplest case, all modules have the same structure. Thus, the same
transition relation $Trans_i$ is used for all of them but with different input
variables.

There are three types of finite state machines: the synchronous, the
asynchronous and the interleaving model. They differ slightly in the structure
of their transition relation, but all models have in common that their
transition relation depends on the relations of the single modules. The
following expression shows, as an example, the structure of the transition
relation for an interleaving model, where $CoStab_i$ fixes the states of all
modules except module $i$.

$$Trans(s, t) = \bigvee_{i=1}^{n} \bigl( Trans_i(s, t_i) \wedge CoStab_i(s, t) \bigr) \qquad (1)$$
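
Equation (1) can be made concrete on explicit states. The following sketch is
not symbolic and uses no BDDs; it only shows how the global relation is
assembled from one shared local relation. Modules are numbered from 0 here, and
the local rule flip_if_left_set as well as all function names are hypothetical.

```python
from itertools import product


def co_stab(i, s, t):
    """CoStab_i: every module except module i keeps its state."""
    return all(s[j] == t[j] for j in range(len(s)) if j != i)


def make_trans(local_trans, n):
    """Interleaving relation: exactly one module moves per step."""
    def trans(s, t):
        return any(local_trans(i, s, t[i]) and co_stab(i, s, t)
                   for i in range(n))
    return trans


def flip_if_left_set(i, s, t_i):
    """Hypothetical Trans_i: module i flips its bit if its left
    neighbour (cyclically) currently holds 1."""
    return s[i - 1] == 1 and t_i == 1 - s[i]


def successors(trans, s):
    """All global states t with Trans(s, t)."""
    return [t for t in product((0, 1), repeat=len(s)) if trans(s, t)]


trans = make_trans(flip_if_left_set, 3)
print(successors(trans, (1, 0, 0)))  # [(1, 1, 0)]
```

The same local rule is reused for every module with a different index, which
is the situation in which structurally identical BDD-parts can be expected.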

For model checking of any kind of temporal property, the BDDs needed
besides this transition relation describe sets of states, for example the set
of reachable states.

4 Cyclic BDDs (CBDDs)

CBDDs try to be more memory-efficient than classical BDDs through a better
exploitation of structural similarities inside the BDD. Concentrating on systems
with similar modules, structurally identical BDD-parts appearing in different
layers of the BDD should be represented only once and used several times. For
this purpose two characteristics are necessary which are not offered by
classical BDDs:

1. One node should have the potential to represent different variables, so the
labels of the nodes must be different from variables.

2. A possibility has to be introduced to branch to different nodes
after the use of a common BDD-part.



4.1 Labels of CBDD-Nodes 

The idea is to split a BDD into layers according to the modules. Suppose that
the maximal number of variables in one module is $m$. In a classical BDD the
nodes could be labeled with the values $i * m + k$, where $i$ is the index of
the module and $k \in \{0, \ldots, m - 1\}$ is the index of the variable inside
the module.

In a CBDD-node the part $i * m$ is omitted and the nodes are only labeled with
the index $k$. In a BDD the knowledge of the root node was sufficient to
characterize a boolean function. Now it is also necessary to know the module
this node belongs to. A boolean function can be derived from the combination
of a node and its module index.
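
The split of a classical label into its two parts is plain integer
arithmetic. A tiny sketch, with hypothetical function names:

```python
def split_label(var_index, m):
    """Classical label i * m + k  ->  (module index i, local index k)."""
    return divmod(var_index, m)


def join_label(module, k, m):
    """(module index i, local index k)  ->  classical label i * m + k."""
    return module * m + k


print(split_label(7, 3))    # (2, 1)
print(join_label(2, 1, 3))  # 7
```

Only k is stored in a CBDD-node; i travels with the node as the module index.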

When an evaluation path through a CBDD is traced, the module index has to
be adjusted when the successor node belongs to a different module. A marking
on such an edge, indicating that the module index has to be increased by one,
would be sufficient. If two nodes belong to non-adjacent modules, the index has
to be increased by a number greater than one. Therefore we will
assume that the marking is realized by an auxiliary special node placed on an
edge between two nodes of different modules; it contains the information for the
determination of the module index for the next node. The reason why we use a
node and not a more expressive marking will be explained in Section 4.2, where
these special nodes are used for an additional functionality.

Until now a CBDD is a BDD where blocks of nodes are placed side by side
instead of one below another. The nodes are labeled with a restricted set of
values and some edges include additional information to determine the module
to which the next node of the CBDD belongs.

As the restricted set of labels can be represented by fewer bits, the size of
a normal CBDD-node is smaller than that of a BDD-node. Such a restriction of the
label representation can also be an approach to address more nodes without
increasing the node size, since more bits remain for the encoding of the
pointers to the successors.

4.2 Branching Points

As introduced in the previous section, nodes which belong to different modules
are separated by a special node. These nodes can be used as branching points.
When a structurally identical BDD-part appears in different levels of the CBDD,
it is not necessary to replicate it. It suffices to introduce an edge which
leads to the root node of the BDD-part that is originally used in a different
level. As this edge can correspond to a back-edge, a cycle is introduced into
the BDD, which explains the name: Cyclic BDDs.

Following a path through a CBDD, a special node is encountered after each
BDD-part. These special nodes are no longer only an interruption of an
edge: several edges can start from them, each labeled with a module index.
When a BDD-part is used several times, additional edges have to be appended
to the special nodes at the end of this BDD-part. Depending on the current
module index, the corresponding successor node is determined by the choice of
the correct outgoing edge of the special node. Furthermore, the module index
has to be adapted.





Fig. 1. BDD and CBDD for the same function. The false leaf is omitted. 



Figure 1 shows an example of such a CBDD and its corresponding BDD.
It represents the reachable states of Milner's scheduler. The BDD grows
linearly while the number of CBDD-nodes remains constant and only the degree
of the outgoing edges of the special nodes increases.

The additional reduction when variables of different modules are represented
by the same nodes is called second order reduction.

When CBDDs are applied to systems which contain no similar modules, a
certain overhead can be caused by the introduction of the special nodes, which
are not necessary when BDDs are used instead. Until a reduction is found, special
nodes are only needed to indicate that the module index has to be increased.
This is implicitly clear when an edge leads from a $k_0$-labeled node to a
$k_1$-labeled node if $k_1$ is smaller than $k_0$. A possible optimization is to
omit those special nodes until a reduction is found and to introduce them only
when they are needed as branching points. This could avoid most of the overhead
in practice.

4.3 Definition of CBDDs 

After the informal description of the data structure in the last two sections we 
will now give a more formal definition of CBDDs: 

Definition 3 (CBDD). A CBDD is a directed graph with three types of nodes:

Leaves are labeled with true and false as in BDDs.

Normal nodes are labeled with an index $k \in \{0, \ldots, m - 1\}$. They have
exactly two outgoing edges, each of which leads either to the leaf false, to a
normal node with an index greater than $k$, or to a special node.

Special nodes have an arbitrary number of successors. Each outgoing edge is
labeled with a module index. A successor node of a special node is either the
leaf true, a normal node, or another special node. The next special node on
each path which starts with an $i$-labeled edge must possess an
$(i+1)$-labeled edge.

A CBDD must always be reduced according to the usual two BDD-reduction
rules.

The BDD reduction rules are applied during the construction of CBDDs just
as for classical BDDs. So the number of normal nodes in a CBDD never exceeds
the number of nodes of the corresponding BDD. On the other hand, it is not
necessary that all possible second order reductions are carried out. A CBDD
where all special nodes have exactly one outgoing edge is a valid CBDD. This
offers the possibility to mix CBDDs where the second order reduction has already
been applied with unreduced CBDDs. Note that CBDDs are always reduced
with respect to the classical BDD reduction rules; they are called unreduced
if no reduction between different levels has been applied.



4.4 Manipulation of CBDDs 

The major change when CBDDs are used instead of BDDs is that a node without
its corresponding module index is not sufficient to characterize a boolean
function. So in practice a strict distinction has to be made between a
CBDD-node and a CBDD represented as a tuple (node, module index).

Fortunately the algorithms for boolean operators, restriction, quantification
and the relational product, which have proven to be very efficient for BDDs,
are not affected and remain unchanged. The only functions which have to be
adapted are the following basic BDD-manipulation operations:

Extraction of a variable from a CBDD-node: It is necessary to know the
module index to calculate the correct variable index.

Successor determination: When the 0- or the 1-successor of a CBDD-node
is determined, the case where the direct successor is a special node has to
be handled. Special nodes can never be the result of a successor
operation. The correct normal node (or leaf) has to be chosen with respect
to the module index. In combination with the increased module index this
tuple is the result of the operation.

Construction of new nodes: Nothing has to be changed if the new node and
its successors belong to the same module. If a successor node belongs
to another module, a special node has to be inserted.
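
The first two operations can be sketched on a small cyclic example. This is an
illustration under stated assumptions, not the prototype library. In
particular, it assumes that a special node reached with module index i is left
through its edge labeled i + 1, and that a chain of special nodes skips one
module per hop. All names (Normal, Special, variable_of, resolve, successor,
evaluate, xor_chain) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Normal:
    k: int                 # local variable index inside the module
    low: object = None     # 0-successor: Normal, Special or False
    high: object = None    # 1-successor: Normal, Special or False


@dataclass(eq=False)
class Special:
    edges: dict = field(default_factory=dict)  # module index -> successor


def variable_of(node, module, m):
    """Variable extraction needs the module index."""
    return module * m + node.k


def resolve(node, module):
    """Skip special nodes; never return one as a result."""
    while isinstance(node, Special):
        module += 1
        node = node.edges[module]
    return node, module


def successor(node, module, bit):
    """Successor determination on the tuple (node, module index)."""
    child = node.high if bit else node.low
    return resolve(child, module)


def evaluate(root, assignment, m):
    """Trace one evaluation path, starting in module 0."""
    node, module = root, 0
    while not isinstance(node, bool):
        bit = assignment[variable_of(node, module, m)]
        node, module = successor(node, module, bit)
    return node


def xor_chain(n):
    """CBDD for: in each of n modules (m = 2) the two bits differ."""
    exit_node = Special()
    n1a = Normal(1, low=False, high=exit_node)
    n1b = Normal(1, low=exit_node, high=False)
    root = Normal(0, low=n1a, high=n1b)
    for module in range(1, n):
        exit_node.edges[module] = root   # back-edge: reuse the same part
    exit_node.edges[n] = True
    return root


print(evaluate(xor_chain(3), (1, 0, 0, 1, 1, 0), 2))  # True
```

Regardless of n, xor_chain uses three normal nodes; only the number of edges
leaving the special node grows, as described for Figure 1.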

However, the most important change is the detection and realization of the 
additional reduction. How the possibilities for the second order reduction are 
determined efficiently and when this reduction is performed will be explained in 
detail in the following section. 

When the unique-table which contains all nodes is consulted to determine if 
a node already exists it has to be considered that the table can contain a node 
more than once. When the direct successor of a node is a special node it can have 
several normal nodes as “real” successors so it will be inserted into the unique- 
table with different hash values. This also becomes important when a garbage 
collection is applied. Nodes can only be deleted when they are unreachable from 
all modules. 

4.5 Determination of the Second Order Reduction 

In the last section the necessary modifications were discussed when CBDDs are 
used instead of BDDs. Those internal changes are only important inside a CBDD- 
library and do not affect the user. The only information which must be indicated 
by the user is the affiliation of the variables to the modules. This information 
could be determined by an intelligent parser and an appropriate input syntax 
but this has not yet been implemented. Moreover a direct manipulation of the 
module structure should remain possible for an experienced user to exhaust the 
potential of CBDDs. 

A very important point is that this information concerning the module
structure does not affect the correctness of the verification. When two modules
turn out to be different, the probability of finding structural identities
between the BDD-parts is very low, but the verification result will still be
valid. So the user only influences how often the second order reduction can be
applied, and no preconditions have to be proven for the use of CBDDs.

To apply the second order reduction, isomorphic BDD-parts, which correspond
to Multi-Terminal-BDDs, have to be determined. To handle the additional
reduction, a new hash-table is introduced. It contains the root nodes of
the maximal BDD-parts. The hash value depends on the structure of the whole
BDD-part. When two entries have the same hash value, it is tested whether their
structure is identical. Supposing an appropriate choice of the hash function, the
test whether two BDD-parts with the same hash value are structurally identical
has to visit all nodes of the BDD-parts only when the test is successful.
Otherwise, a difference will in general appear after the comparison of very few
nodes, which allows an efficient determination of the second order reduction.
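
A simplified version of this lookup can be sketched as follows. Instead of a
hash value followed by a separate structural comparison, the sketch uses the
complete structural signature of a BDD-part as the dictionary key, so equal
keys already mean structural identity in the sense of Definition 2. This is a
swapped-in simplification, not the hash function of the prototype. It again
assumes module = var // m, and the names Node, signature and
group_identical_parts are hypothetical. No memoization is done, so shared
sub-parts are visited repeatedly.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    var: int        # global variable index
    low: object     # 0-successor: Node, True or False
    high: object    # 1-successor: Node, True or False


def signature(node, module, m):
    """Structure of a BDD-part using only local indices k."""
    if node is False:
        return "F"                       # false leaf belongs to the part
    if node is True or node.var // m != module:
        return "EXIT"                    # edge that leaves the part
    return (node.var % m,
            signature(node.low, module, m),
            signature(node.high, module, m))


def group_identical_parts(roots, m):
    """Group roots of maximal BDD-parts that could share one CBDD-part."""
    table = defaultdict(list)
    for root in roots:
        table[signature(root, root.var // m, m)].append(root)
    return [group for group in table.values() if len(group) > 1]


n3 = Node(3, False, True)
n2 = Node(2, False, n3)
n1 = Node(1, False, n2)
n0 = Node(0, False, n1)
other = Node(4, True, False)
groups = group_identical_parts([n0, n2, other], 2)
print(len(groups), [r.var for r in groups[0]])  # 1 [0, 2]
```

A real implementation would store a compact hash of this signature and compare
candidates node by node, stopping at the first difference.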

Typical efficient implementations of BDD-libraries do not allow the second
order reduction to be applied efficiently during the construction, because the
reduced nodes could not be removed. As there can be pointers from other nodes
and from the computed table to the reduced nodes, they would have to be kept
until a garbage collection is performed, or a traversal of the two hash-tables
would be necessary. Therefore, we delay the search for the reduction until a
garbage collection takes place and perform all possible second order reductions
in a reduction phase. At that point, the hash-tables have to be traversed anyway,
and after the dead nodes have been marked it is checked whether structural
identities allow a further reduction of the remaining nodes.

In [21] it was shown that it is most efficient to perform a garbage collection
as late as possible because nodes can be reallocated, allowing the reuse of
already performed computations. So another approach to integrate the reduction
phase into the verification process is to replace a garbage collection by a
reduction phase. If enough nodes can be reduced, the execution of a garbage
collection can be delayed because the same information can be represented with
fewer nodes and there is enough memory left to continue the verification.

4.6 Uniqueness 

One of the most important properties of BDDs is the uniqueness of the repre- 
sentation already proven by Bryant [5]. Two boolean functions are equivalent if 
and only if their BDD-representation is equal. As this test can be performed in 
constant time it is frequently used by the classical BDD-algorithms and allows 
the reuse of former results contributing strongly to the efficiency of BDDs. 

Unfortunately, CBDDs are not unique. Uniqueness is lost because of different
possibilities for the second order reduction. This seems to impair the efficient
use of CBDDs, but this deficiency can be avoided. As for BDDs, the
equality test for CBDDs can still be performed in constant time. In Section 4.4 it
was explained that only the algorithms for the basic CBDD-manipulations have
to be changed slightly while the boolean operations are not affected. Uniqueness
of the underlying data structure is a necessary condition neither for
an efficient CBDD-manipulation nor for an equivalence test in constant time.

Definition 4 (weak uniqueness). A set $S = \{\Phi(f) \mid f : \mathbb{B}^n \to \mathbb{B}\}$
of representations $\Phi(f)$ for boolean functions $f$ is called weak unique if
equivalent boolean functions are represented by the same representative:

$$\forall f_1, f_2 : \mathbb{B}^n \to \mathbb{B},\ \bigl(\Phi(f_1), \Phi(f_2) \in S\bigr) \Rightarrow \bigl(f_1 = f_2 \Leftrightarrow \Phi(f_1) = \Phi(f_2)\bigr)$$

Lemma 1. The set of CBDDs stored by a CBDD-library is weak unique.

Proof idea: The unique-table guarantees that each newly created node is not
contained in the set of stored CBDD-nodes. CBDDs are constructed in a bottom-up
style like BDDs. Therefore an induction on CBDD-size is possible. The fact
that the successor nodes exist only once is sufficient to guarantee, in
combination with the use of the unique-table, that a set of CBDDs is weak
unique. □

Note that the definition of weak uniqueness is different from uniqueness.
When a unique representation is used, its structure is fixed from the moment
when the semantics of a concrete function is defined. If a weak unique structure
is used, there can be several points during the construction of the
representation for a concrete function where a nondeterministic choice has to
be made. The important condition to guarantee the weakened form of uniqueness
is that once this choice has been made, the same choice will be taken whenever
that point is reached again.



5 Experiments 

We evaluated our approach on several examples. In [14], a scalable circuit for
a division is given. For a dividend of $n$ bits and a divisor of $d$ bits it
consists of $(n - d) \cdot d$ cells of two different types. The verification can
be divided into two steps. First, an intermediate result has to be determined
which contains variables for each cell. These variables are eliminated with a
quantification, but as the intermediate result is usually bigger than the final
BDD, it is important also to reduce this intermediate result. When the bit
length of the divisor is fixed, the number of CBDD-nodes increases up to a
certain bit length of the dividend. Afterwards, the number of CBDD-nodes remains
constant; only the number of successors of the special nodes increases linearly.
In the corresponding BDD, the number of nodes grows linearly.

As a model checking example we used the tree-arbiter [10]. It is a distributed
hardware solution for a mutual exclusion mechanism. There are 2n users who
can request the use of one resource, and the tree-arbiter eventually returns an
acknowledgement. It has to be assured that no two users are granted simultaneous
access to the resource. The tree-arbiter is constructed out of 2n - 1 cells of the
same type which form a pyramid structure where each internal cell communicates
with three other cells, so there exists no “good” variable ordering which would be
necessary for a linear BDD representation. We experimented with tree-arbiters
with 5 to 21 modules. The transition relation ranges from 4200 to 578000
BDD-nodes, which could be reduced to 48% - 75% of that size if CBDDs are used.
During the iteration process which calculates the set of reachable states the
necessary BDDs can be reduced to 75% on average. Note that the best percentage
of reduction was obtained for large systems, while for arbiters with few modules
only a small number of nodes could be reduced. A similar system is the DME
(distributed mutual exclusion), where the modules are organized as a ring
structure. There we observed the effect that for the transition relation the
number of BDD-nodes increases linearly, while from a certain system size on the
number of CBDD-nodes remains constant and only the number of successors of the
special nodes increases.

An example for protocol verification is the PCI-bus protocol [20]. We exper- 
imented with a scenario with up to 6 slots and two modes of policy: round robin 
and fixed priority. Each of the slots can be empty or equipped with a “user” 
which eventually requests the bus. So the system is scalable depending on the 
number of non-empty slots. In addition, an arbitration module with a different 
structure is needed. We observe that an intermediate result is bigger than the 
transition relation itself (factor 2 to 8). The BDD representing the reachable 
states has only very few nodes in comparison to the transition relation. For a 
scenario where all slots are empty, a very good reduction can be obtained be- 
cause the transition relations for all slots are equal. For two filled slots the least 
reduction is found, only 18% can be reduced for a fixed priority policy (12% 
for round robin). Then the percentage of reduction increases with the number 
of filled slots. For the round robin policy and 5 equipped slots the BDD of the 
intermediate result could be reduced from 1,300,000 nodes to 920,000, and for 
the fixed priority policy with 6 slots only 387,000 CBDD-nodes are necessary 
instead of 776,000 BDD-nodes, corresponding to a reduction of more than 50%. 

Big transition relations can also be handled using a partitioned transition re- 
lation. In this case only the relations for the single transitions have to be de- 
termined, each represented by a different BDD because they depend on different 
variables. In this case CBDDs can also help to save memory because one CBDD 
can be used to represent several transition relations of single modules which are 
structurally identical for equal modules. 

The experiments show that the best reduction is obtained for big systems 
which is very desirable because the method is aimed at systems 
which cannot be represented within the available main memory. Regarding the 
example of the PCI-Bus, the BDD for the representation of the reachable states 
is very small but before it can be constructed the BDD for the transition relation 
has to be determined. It is much bigger and must be held in main memory for the 
whole calculation of the reachable states. A reduction of this BDD which allows 
the rest of the calculation to be performed without exceeding the space limitations 
can yield an important gain of time. 

In all experiments the additional time necessary for the determination and 
realization of the second order reduction never exceeded 15% of the time needed 
for the whole verification. 



6 Conclusion 

We presented an alternative technique to represent boolean functions using a 
BDD-variation. It was shown that the characterization of a boolean function with 
its root node is not necessarily the best choice. The introduced enhancements of 
the classical BDD structure allow the economization of nodes while the efficiency 
of BDD-algorithms is not affected. 
Depending on the application and its modules, structural identities appear 
in the BDD which can be exploited by the approach. This leads to a more 
space efficient system representation increasing the degree of reduction: nodes 
are “used” from different levels of the CBDD without the obligation to prove 
any properties by manual interference. The user only has to identify the modules 
which occur several times or probably have the same structure. The second 
order reduction is found automatically in case identical BDD-parts exist. As 
the method only tries to reduce the main memory requirements it causes an 
overhead in time which is necessary to determine the additional reduction. So 
its advantages are mainly important for the verification of big systems which 
cannot be verified within the main memory limitations. In this case the reduction 
of space quickly turns into a reduction of time, since the use of second-level 
memory, which is usually an order of magnitude slower, can be avoided. 

Sharing of BDD-parts in one layer would require the possibility to remove 
former reductions. As this seems not to be feasible efficiently the reduction is 
limited to the reuse of BDD-parts in different layers. Therefore the maximal 
reduction is linear in the number of modules. This cannot avoid an exponential 
blow-up but the memory consumption can be reduced. 

It is intended to evaluate the efficiency of the approach on further examples 
and to examine the applicability of it to other BDD-variations. Also, the com- 
bination of this method with popular BDD-optimizations like dynamic variable 
reordering [18] will be investigated. 
Abstract. Formal verification tools must often cope with large memory 
sizes and indirect addressing. This paper presents a new approach of 
how to handle memory operations in the symbolic simulation of designs 
with complex control logic, e.g., processors. The simulator is currently 
used to check the equivalence of two processor descriptions with distinct 
order of memory operations. During symbolic simulation, relationships 
between memory operations are automatically detected while addresses 
and the memory states are given symbolically to summarize many test- 
vectors. The integration of the technique in the equivalence checker is 
demonstrated by example designs. 



1 Introduction 

Formal verification of designs with complex control has to consider memories 
which have large sizes and are addressed indirectly. In [13] we presented a new 
approach for the automatic equivalence checking of designs with complex control. 
Our method is able to cope with different numbers of control steps and differ- 
ent implementational details in the descriptions to be compared. The verifica- 
tion tool combines symbolic simulation with a hierarchy of equivalence checking 
methods, including decision-diagram based techniques. A complete verification is 
possible in contrast to “classical” simulation since symbolic values are used. One 
symbolically simulated path corresponds in general to a large number of “clas- 
sical” simulation runs. During the symbolic simulation, relationships between 
symbolic terms, e.g., the equivalence of two terms are detected and recorded. 
The equivalence detection for memory operations, which has to consider sym- 
bolic address relationships, is described in this paper. Addresses are compared 
using only the information of equivalence classes established previously. This 
makes a fast equivalence detection possible, which can cope with complex re- 
orderings of memory operations. 

Currently the symbolic simulator is used to check the computational equivalence 
of two descriptions of complex control logic. Two descriptions are computation- 
ally equivalent if both produce the same final values at definite time steps on 
the same initial values relative to a set of relevant variables. For instance, the 
two descriptions in Fig. 1 are computationally equivalent with respect to the 
final value of the relevant variable z. Parentheses enclose synchronous parallel 
Specification 

  rf[adrA] ← a; 
  rf[adrB] ← b; 
  mem[adr1] ← val; 
  x ← mem[adr2]; 
  z ← x + rf[adrR]; 

Implementation 

  (rf[adrB] ← b, 
   x ← mem[adr2]); 
  (if adrA ≠ adrB 
     then rf[adrA] ← a endif, 
   mem[adr1] ← val); 
  (if adr1 = adr2 
     then z ← val + rf[adrR] 
     else z ← x + rf[adrR] endif); 

Fig. 1. Forwarding example 



transfers in Fig. 1. The sequential composition operator separates consecutive 
transfers. There are two examples for a reordering of memory operations in Fig. 
1. Firstly, the order of the read- and the store-operation to mem is reversed 
in the implementation. Thus, val is forwarded if the addresses are identical, 
otherwise the value assigned to x is used. This is a typical forwarding example 
occurring in pipelined systems. Secondly, the order of the store-operations to 
the register file rf is reversed. This can, for example, happen during synthesis 
of architectures using data memory mapping, i.e., some single registers can be 
addressed by instructions in the same manner as registers of the register file. 
This is common for many microcontrollers, e.g., PIC, 8051, etc. Synthesis can 
change the order of accesses to this “common” data memory, e.g., by introducing 
pipelining. Formal verification has to consider the access to registers and regis- 
ter file by a single memory model. Otherwise it may remain unrevealed that, for 
example, the program counter is erroneously overwritten by an instruction due 
to a missing address comparison. 

Fig. 2 gives a simplified overview of the symbolic simulation algorithm which 
has been implemented iteratively for optimization [13]. The equivalence checker 
simulates symbolically all possible paths. Each symbolically executed assignment 
establishes an equivalence between the destination variable on the left and the 
term on the right side of an assignment. Additional equivalences between terms 
are detected during simulation. Equivalent terms are collected in equivalence 
classes. False paths are avoided by making only consistent decisions at branches 
in the description. When simulation reaches a condition C that cannot be de- 
cided in general but depends on the initial register and memory values (line 2, 
e.g., adr1=adr2 in Fig. 1), a case split is performed (line 3). Note that equiv- 
alence_check is only called recursively with those parts of spec and impl that 
are not simulated yet. 

A complete path is found when the end of both descriptions is reached. The first 
check of the final values of the registers (line 4) may lead to a false negative, 
since equivalence detection during the path search is not complete. Only relation- 
ships between terms that are fast to detect or are often crucial to demonstrate 
computational equivalence are considered on the fly. Therefore, more accurate 
equivalence checking methods, including decision diagram based techniques, are 
used to verify if computational equivalence is given in this path but was only 
Algorithm equivalence_check 
INPUT spec, impl; 

1.  Simulate spec and impl in parallel UNTIL 
      (a) a condition C is reached that cannot be decided in general but depends on 
          the initial register and memory values or 
      (b) the end of both descriptions is reached. 
2.  IF a condition C blocks THEN 
3.    RETURN (equivalence_check(spec, impl)|C=false) ∧ 
             (equivalence_check(spec, impl)|C=true) 
4.  ELSEIF final values of registers are equivalent THEN 
5.    RETURN (TRUE); 
6.  ELSE 
7.    Use more accurate equivalence checks; 
8.    IF (final values of registers are equivalent) ∨ 
         (a condition has been decided inconsistently) THEN 
9.      RETURN (TRUE); 
10.   ELSE 
11.     RETURN (FALSE); 

Fig. 2. Simplified algorithm of the symbolic simulation 

not detected on the fly or a condition has been decided inconsistently (line 8), 
i.e., a false path is reached. Otherwise the counterexample with relevant details 
about the simulation run for debugging is reported. Our automatic verification 
process does not require insight of the designer into the verification process. 
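
The recursive case split of Fig. 2 can be illustrated with a small sketch. It is not the implementation of [13]: the `simulate` callback, its return tuples, and the helper `simulate_forwarding` (which hard-codes the forwarding example of Fig. 1) are hypothetical names introduced only for illustration, and symbolic terms are modeled as plain Python tuples.

```python
def equivalence_check(simulate, decisions=None):
    """Recursive case split in the spirit of Fig. 2 (hypothetical interface).

    simulate(decisions) returns either ("blocked", condition) or
    ("done", (spec_final, impl_final, accurate_check)).
    """
    decisions = dict(decisions or {})
    status, payload = simulate(decisions)
    if status == "blocked":
        # Lines 2-3: split on the undecided condition and require both cases.
        cond = payload
        return (equivalence_check(simulate, {**decisions, cond: False})
                and equivalence_check(simulate, {**decisions, cond: True}))
    spec_final, impl_final, accurate_check = payload
    if spec_final == impl_final:
        # Lines 4-5: the fast on-the-fly check already succeeds.
        return True
    # Lines 7-11: fall back to a more accurate check, which is also assumed
    # to report TRUE when the path turns out to be a false path.
    return accurate_check(spec_final, impl_final, decisions)


def simulate_forwarding(decisions):
    """Assumed stand-in for simulating both descriptions of Fig. 1."""
    cond = "adr1=adr2"
    if cond not in decisions:
        return "blocked", cond
    same = decisions[cond]
    # Specification: the store to mem precedes the read of x.
    spec_z = ("+", "val" if same else "mem[adr2]", "rf[adrR]")
    # Implementation: x is read first; val is forwarded on equal addresses.
    impl_z = ("+", "val" if same else "mem[adr2]", "rf[adrR]")
    return "done", (spec_z, impl_z, lambda s, i, d: False)


print(equivalence_check(simulate_forwarding))
```

An implementation that omits the forwarding branch would produce different final terms in the case adr1=adr2, so the conjunction in line 3 would fail for that path.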

[13] gives a detailed description of the symbolic simulation, the decision process 
and the decision diagram based techniques. The efficiency of our symbolic sim- 
ulation is demonstrated and compared with other approaches. The focus of this 
paper is the equivalence detection for memory operations, which must consider 
large memory sizes and indirect addressing. 

Some related work is reviewed in section 2. The internal data structure, the 
memory model and some terminology are given in section 3. Section 4 overviews 
the equivalence detection for memory operations. A more detailed description 
is given for read-operations in section 5 and for store-operations in section 6. 
Experimental results are presented in section 7. Finally, section 8 gives a con- 
clusion. 

2 Related Work 

Various representations for memory operations have been proposed for formal 
verification of designs with complex control. States are represented by decision 
diagrams in [4] for traversing an automaton for model checking and in [3] for equiv- 
alence checking. This permits the representation of a register file, but not, e.g., 
of a large data memory due to the sensitivity to graph explosion. [14] uses deci- 
sion diagrams combined with an encoding technique to represent uninterpreted 
symbols. Memories and functional units (e.g., the ALU) are abstracted by mem- 
ory models. OBDDs represent the addresses of those models and are used for 
address comparison. The approach has the disadvantage that the complexity of 
the simulation increases exponentially if the memory models are addressed by 
data of other memory models. This is for example the case when a data memory 
is addressed by an ALU and writes to the register file. Note that the ALU has 
to be represented by a memory model. Therefore, the processor they verified in 
[14] contains neither a data-memory nor branching. 

SVC (the Stanford Validity Checker) [1, 2, 11] is a proof tool for automatic 
verification of formulas which can contain the two array operations read and 
write to model memory operations. Verification of control logic is possible us- 
ing SVC if the verification task can be reduced to a formula which is suffi- 
cient to demonstrate the verification goal. For instance, [5] proposed an ap- 
proach to generate such a formula for the verification of a pipelined system 
against its sequential specification. Formal verification of control logic with 
SVC compared to our symbolic simulation method is discussed in [13]. Rela- 
tionships of memory operations are revealed by SVC basically by case analysis. 
A read-operation read(write(s, aW, v), a1) after a write-operation is rewrit- 
ten to ite(a1 = aW, v, read(s, a1)). A case analysis is required to prove that 
read(write(s, aW, v), a1) = read(write(s, aW, v), a2) follows from a1 = a2. 
Our approach avoids case analysis on memory operations. Equivalences of mem- 
ory operations as the example above are detected in a different manner during 
the simulation. Furthermore, rewriting and case analysis can also become im- 
practicable in a formula prover if the memory operations cause too many case 
splits. This is, for example, the case, if operands are read repeatedly from 
a memory and the result is written back. Consider a simple architecture, where 
an instruction with two source- and one destination-address is read from an 
instruction memory. The source values are read from data memory, are added 
and the result is written back. Finally the program counter is incremented and 
the next instruction is fetched. Equivalence checking of the data memory after, 
e.g., six instructions requires already 11,868,920 case splits using SVC (4,396s 
on a 300 MHz Sun Ultra II), if we reverse the order of the first two instructions 
addressing distinct places in the data memory. Our approach avoids these case 
splits. 

[12] uses ACL2 to symbolically simulate executable formal specifications with- 
out requiring expert interaction. Related is the work in [7], where pre-specified 
microcode sequences of the JEM1 microprocessor are simulated symbolically us- 
ing PVS. [12] models memories as lists of symbolic values which represent the 
memory contents, i.e., the length of the lists grows with the memory size. This 
explicit modeling allows no symbolic values in indirect addressing, since, e.g., a 
store-operation with a symbolic address can change any memory place. But 
the intention of [12] is completely different. The tool provides a fast simulation 
on some indeterminate data for debugging a specification, i.e., the instruction 
sequence at the machine level is fixed. Our tool copes not only with some inde- 
terminate data but verifies every possible control flow. 
3 Preliminaries 

Our equivalence checker compares two acyclic descriptions at the rt-level. For 
many cyclic designs, e.g., pipelined machines, the verification problem can also 
be reduced to the equivalence check of acyclic sequences [13]. 

The memory model used by the symbolic simulator assumes an infinite size of 
each memory in the descriptions. Similar to [5, 1], two array operations are 
used to model memory access: read (mem, adr) returns the value stored at the 
address adr of memory mem. The operation store (mem, adr ,val) returns the 
whole memory state of mem after changing the memory state only at adr to val. 
The two operations are used for all accesses to arrays of a description that can 
be addressed indirectly. This includes not only, e.g., the data memory of a pro- 
cessor but also an indirectly addressed register file. On the other hand, directly 
addressed memories, i.e., cases where the addresses are constants, need not 
be modeled by the read/store-scheme. A directly addressed memory place can 
also be considered as a register, which in practice is done by replacing all such memory 
operations by a new distinct register name (e.g., mem[3] ← x becomes mem3 ← x). 
The inherent timing structure of the initial description is expressed explicitly by 
indexing the register and memory names. An indexed register name or memory 
name is called a RegVal. A new RegVal with an incremented index is introduced 
after each assignment and after each store-operation to a memory. An additional 
upper index s or i distinguishes the RegVals of specification and implementation. 
For example, a ← a+b; is replaced by a_2^s ← a_1^s + b_1^s; in the specification if all reg- 
isters have been already assigned once. The third store-operation to a memory 
mem[adr] ← val; becomes mem_3^s ← store(mem_2^s, adr_1^s, val_1^s). The RegVals mem_2^s 
and mem_3^s represent the memory state before and after the store-operation. 
Only the initial register/memory names as anchors are identical in specifica- 
tion and implementation, since the equivalence of the two descriptions is tested 
with regard to arbitrary but identical initial register values and memory states. 
Checking computational equivalence consists of verifying that the RegVals of the 
register or memories with the highest index are always equivalent. 

Our symbolic simulation method has to detect equivalent terms. 

Definition 1 (Equivalence of terms). Two terms or RegVals are equivalent, 
=term, if under the decisions C_0, ..., C_n taken previously on the path their 
values are identical for all initial RegVals. The operator | denotes that each 
case-split, leading to one of the decisions C_0, ..., C_n, constrains the set of possible 
initial RegVals. 

term1 =term term2 ⇔ ∀ RegVal_initial : (term1 = term2) | (C_0 ∧ C_1 ... ∧ C_n) 

Equivalent terms are detected along valid paths, and collected in equivalence 
classes. We write term1 =sim term2 if two terms are in the same equivalence 
class established during simulation. If term1 =sim term2 then term1 =term 
term2. Note that RegVals describe also different memory states, e.g., mem_1^s =sim 
mem_2^i indicates that the two memory states are equivalent. Initially, each RegVal 
and each term gets its own equivalence class. Equivalence classes are unified after 
every assignment, if two terms are identified to be equivalent and when deciding 
the condition of an if-then-else-clause in a case split. Equivalence classes also 
make it possible to keep track of unequivalences of terms: 

Definition 2 (Unequivalence of terms). Two terms or RegVals are unequiv- 
alent, ≠term, if under the decisions C_0, ..., C_n taken previously on the path their 
values are never identical for arbitrary initial RegVals: 

term1 ≠term term2 ⇔ ¬∃ RegVal_initial : (term1 = term2) | (C_0 ∧ C_1 ... ∧ C_n) 

We write term1 ≠sim term2 if two terms are identified to be ≠term during 
simulation. Equivalence classes containing ≠sim terms are unequivalent, i.e., 
they contain different constants or terms, that have been decided in a case split 
to be unequivalent. The information of the equivalence classes is required to 
decide conditions consistently each time a new if-then-else is reached, i.e., to 
avoid false paths. 
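
One possible way to keep track of =sim and ≠sim relations is a union-find structure whose class representatives carry a set of classes known to be unequivalent. The sketch below is an assumption about a suitable data structure, not the one used in the tool; the class name `EquivalenceClasses`, its methods, and the result strings `"=sim"`, `"!=sim"` and `"unknown"` are hypothetical.

```python
from collections import defaultdict


class EquivalenceClasses:
    """Union-find over terms plus recorded unequivalences (illustrative)."""

    def __init__(self):
        self.parent = {}
        # representative -> representatives of classes known to be unequivalent
        self.unequal = defaultdict(set)

    def find(self, term):
        # Initially, each term gets its own equivalence class.
        self.parent.setdefault(term, term)
        while self.parent[term] != term:
            self.parent[term] = self.parent[self.parent[term]]  # path halving
            term = self.parent[term]
        return term

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if rb in self.unequal[ra]:
            raise ValueError("inconsistent decision: false path")
        self.parent[rb] = ra
        # Re-attach the unequivalences of the absorbed class to the new one.
        for other in self.unequal.pop(rb, set()):
            self.unequal[other].discard(rb)
            self.unequal[other].add(ra)
            self.unequal[ra].add(other)

    def set_unequal(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            raise ValueError("inconsistent decision: false path")
        self.unequal[ra].add(rb)
        self.unequal[rb].add(ra)

    def compare(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return "=sim"
        if rb in self.unequal[ra]:
            return "!=sim"
        return "unknown"  # depends on the initial register or memory values


classes = EquivalenceClasses()
classes.union("adr1", "adrR")
classes.set_unequal("adr2", "adrR")
print(classes.compare("adr1", "adr2"))
```

Raising an error on a contradictory `union` or `set_unequal` corresponds to recognising that a condition would be decided inconsistently, i.e., that the path is a false path.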

4 Equivalences of Memory-Operations 

Three types of equivalences have to be detected concerning memory-operations: 

• Value stored by a store is equivalent to a read (Sections 5.1, 5.3) 

A read-operation always reads a value previously stored by a unique store- 
operation. Note that the read-operation occurs after the store-operation 
during simulation, i.e., this equivalence is only checked for read-operations. 

• Equivalence of two read-operations (Sections 5.2, 5.3) 

Two read-operations are equivalent since they always yield the same value. 

• Equivalence of two store-operations (Section 6) 

The resulting memory states are equivalent, i.e., the contents of the mem- 
ories after the two store-operations in specification and implementation, 
respectively, are always identical. Often, the memory states before the store- 
operations are also equivalent, which is fast to check. The stores can also 
result in identical memory states in the opposite case for two reasons: 

- a store-operation is overwritten by succeeding stores, see section 6.2. 

- the order of store-operations to the memory is different in specification 
and implementation, see section 6.3. 

Our equivalence detection is hierarchical: first an identical store-order in 
both descriptions is assumed, i.e., the memory states are pairwise identi- 
cal. Then possibly overwritten stores are considered. Only if still a store- 
operation has no equivalent counterpart in the other description and a fast 
pre-check is satisfied, the more time consuming technique presented in sec- 
tion 6.3 is used to detect a changed order of store-operations. 

For many operations the equivalence of terms can be decided by simply testing 
if the arguments are =sim or ≠sim, which avoids the expansion of the arguments. 
This is also consequently used for the equivalence detection of read- and store- 
operations. Only the information of the equivalence classes of the addresses is 
used, i.e., our address comparison checks if two addresses adr1 and adr2 are 

1. in the same equivalence class, i.e., adr1 =sim adr2, 

2. in unequivalent equivalence classes, i.e., adr1 ≠sim adr2, or 

3. neither, i.e., the equivalence depends on the initial register or memory values. 

Expansion of arguments as in [14], where boolean expressions are evaluated, is 
avoided. Note that the equivalence detection first considers the arguments of a 
term, i.e., the subterms. 

Due to the space limitations and to clarify the problems, we use the following 
abbreviations in the examples of the next sections: 

• Only the relevant read- and store-operations and address relations are 
shown. The in general complex control structure (e.g., if-then-else clauses) 
and all assignments to registers which do not include a read-operation are 
omitted. We, therefore, consider always only one path of the symbolic simu- 
lation. Note that our equivalence detection for memory operations does not 
require additional case splits. 

• It is assumed that equivalences/unequivalences of the addresses have already 
been determined by other equivalence detection techniques [13] or are caused 
by case splits at preceding conditions of if-then-else-clauses which are omit- 
ted, see above. 

• Addresses or values with identical name in specification and implementa- 
tion, i.e., without the upper index s or i, stand for arbitrary terms, which 
are assumed to be detected previously to be =sim. Using adr1 can signify 
textually different terms in both descriptions, e.g., adr1^s = a_1^s + b_1^s and 
adr1^i = c_1^i + a_2^i, which are equivalent if b_1^s = c_1^i and a_1^s = a_2^i. 

• The boxes below the examples indicate, which additional relationships of the 
addresses must hold for two terms or memory states to be equivalent. 

5 Detecting Equivalences of Read-Operations 

5.1 Reading a Previously Stored Value 

If the address of a read-operation reading from a memory and the address of the 
last store-operation referring to this memory are =sim, then the value stored 
by this store-operation is always read. Fig. 3 (a) gives an example. The memory 
state mem_1 resulting from the last store-operation is in this case the same as the 
first argument of the read-operation. 

If there is another intervening store-operation as in Fig. 3 (b), this relationship 
does not hold, since the second store-operation can overwrite the value stored 
by the first. But if the address of the read-operation is ≠sim to the address 
of the second store, its value is never read by this read-operation. For the 
read it seems as if the last store did not happen. In general, all preceding 
stores of a read with unequivalent addresses have to be ignored. This is done 
by calculating the read access of a read-operation, i.e., the relevant memory 
state. The addresses of all store-operations in-between this memory state and 
the read-operation are unequivalent to the address of the read. The store 




(a) mem_1 ← store(mem, adr1, val1); 
    reg_1 ← read(mem_1, adrR); 

    [ adr1 =sim adrR  ⇒  reg_1 =sim val1 ] 

(b) mem_1 ← store(mem, adr1, val1); 
    mem_2 ← store(mem_1, adr2, val2); 
    reg_1 ← read(mem_2, adrR); 

    [ (adr2 ≠sim adrR) ∧ (adr1 =sim adrR)  ⇒  reg_1 =sim val1 ] 

Fig. 3. Reading previously stored values 

previous to the read access has an address that is not unequivalent and its value 
might be read. If the address of this store is even =sim, the stored value is 
always read and, therefore, =sim to the read-operation. 
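
The calculation of the read access can be sketched as a backward walk over the stores to one memory. The sketch assumes that the stores are kept as a list of (address, value) pairs in order of appearance, that memory state k denotes the state after the first k stores (0 being the initial state), and that address relations are supplied by a three-valued comparison. The helper `make_compare` is a hypothetical stand-in for the equivalence classes and does not compute transitive closure.

```python
def make_compare(equal_pairs=(), unequal_pairs=()):
    """Assumed stand-in for the equivalence classes of the simulator."""
    eq = {frozenset(p) for p in equal_pairs}
    ne = {frozenset(p) for p in unequal_pairs}

    def compare(a, b):
        if a == b or frozenset((a, b)) in eq:
            return "=sim"
        if frozenset((a, b)) in ne:
            return "!=sim"
        return "unknown"

    return compare


def read_access(stores, state, adr_r, compare):
    """Skip all preceding stores whose address is unequivalent to adr_r."""
    while state > 0 and compare(stores[state - 1][0], adr_r) == "!=sim":
        state -= 1
    return state


def value_always_read(stores, state, adr_r, compare):
    """Return the stored value that is always read, or None if undetermined."""
    k = read_access(stores, state, adr_r, compare)
    if k > 0 and compare(stores[k - 1][0], adr_r) == "=sim":
        return stores[k - 1][1]
    return None


# Fig. 3 (b): the second store is ignored, the first one is always read.
stores = [("adr1", "val1"), ("adr2", "val2")]
compare = make_compare(equal_pairs=[("adr1", "adrR")],
                       unequal_pairs=[("adr2", "adrR")])
print(read_access(stores, 2, "adrR", compare))
print(value_always_read(stores, 2, "adrR", compare))
```

Only the relations between addresses are consulted, so no address term has to be expanded.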

5.2 Equivalence of Read-Operations 

Two read-operations from specification and implementation are equivalent if 
their addresses and their read accesses are equivalent. The equivalence of the 

Specification                              Implementation 

mem_1^s ← store(mem, adr1, val1);          mem_1^i ← store(mem, adrX, valX); 
reg_1^s ← read(mem_1^s, adrR);             mem_2^i ← store(mem_1^i, adr1, val1); 
                                           reg_1^i ← read(mem_2^i, adrR); 

[ adrR ≠sim adrX  ⇒  reg_1^s =sim reg_1^i ] 

adr1 is assumed to be neither ≠sim nor =sim to adrR 
Fig. 4. Equivalence of two read-operations 

read accesses guarantees, that all locations of the memory where they might 
read from (depending on the concrete value of the symbolic address) are iden- 
tical. This procedure fails in the example of Fig. 4 if adr1 is neither ≠sim nor 
=sim to adrR. The first store in the implementation is not relevant for the read- 
operation, if its address adrX is unequivalent to adrR. But the read accesses of 
the two read-operations are not identical because of the blocking second store 
with adr1. Note that if adr1 ≠sim adrR holds, the read accesses would be both 
mem and if adr1 =sim adrR holds, val1 would be read in both cases. 

A supplementary check for two read-operations with equivalent addresses is 
provided to cope with mismatching read accesses. If stored value and address 
of the “blocking” store-operations are equivalent, the read access is calculated 
again for both read-operations without these stores. This process can be re- 
peated until either equivalent read accesses are found, i.e., the read-operations 
are equivalent, or store-operations block whose addresses or stored values are 
not equivalent. Note that the memory states of the blocking store-operations 
do not need to be equivalent, see the example in Fig. 4. 
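
The supplementary check can be sketched as a loop that alternates between computing the read accesses and removing a pair of equivalent blocking stores. The sketch makes several simplifying assumptions: both reads use addresses that are already =sim (represented by the single name `adr_r`), both memories start from the same initial state, and two memory states are only recognised as equivalent when both are the initial state. All function names are hypothetical.

```python
def reads_equivalent(spec_stores, impl_stores, adr_r, compare):
    """Check two reads at the end of both store lists (illustrative only)."""

    def access(stores, k):
        # Read access: ignore stores with an address unequivalent to adr_r.
        while k > 0 and compare(stores[k - 1][0], adr_r) == "!=sim":
            k -= 1
        return k

    s, i = len(spec_stores), len(impl_stores)
    while True:
        s, i = access(spec_stores, s), access(impl_stores, i)
        if s == 0 and i == 0:
            return True            # both reads access the initial memory state
        if s == 0 or i == 0:
            return False           # only one side is still blocked
        (s_adr, s_val), (i_adr, i_val) = spec_stores[s - 1], impl_stores[i - 1]
        if compare(s_adr, i_adr) != "=sim" or compare(s_val, i_val) != "=sim":
            return False           # blocking stores differ
        s, i = s - 1, i - 1        # drop the equivalent blocking stores


def compare(a, b):
    """Assumed address relations of Fig. 4: only adrR and adrX are unequivalent."""
    if a == b:
        return "=sim"
    if {a, b} == {"adrR", "adrX"}:
        return "!=sim"
    return "unknown"


spec = [("adr1", "val1")]
impl = [("adrX", "valX"), ("adr1", "val1")]
print(reads_equivalent(spec, impl, "adrR", compare))
```

In the example of Fig. 4 the first pass stops at the stores with adr1 on both sides; after they are dropped, the store with adrX is skipped and both read accesses become the initial memory state.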



5.3 Re-checking Read-Operations 

Our general equivalence detection considers that the equivalence of the argu- 
ments of two terms is in most of the cases already obvious, when the second 
term is found on the path. Therefore, it is sufficient to check only at the first 
occurrence of a term whether it is equivalent to some previously found term. 
When finding a read-operation the first time, not all equivalences and unequiva- 
lences relevant for equivalence detection may be already stated, see for instance 
the forwarding on x in Fig. 1. The equivalence of the read-operation in the 
specification and in the else-path of the implementation is only obvious after 
the case split setting adr1 ≠sim adr2. This case is common in processor design 
with pipelining. A value is read speculatively and used only if there is no data 
conflict. Otherwise the relevant value is forwarded. The example indicates, that 
it is important to check read-operations whenever the equivalence classes of 
the corresponding addresses are modified. Therefore, the read-operations found 
during the symbolic simulation are marked at the equivalence classes of their ad- 
dresses as dependent read-operations. If there is a change of an equivalence class, 
either because it is unified or set unequivalent to another equivalence class, all 
dependent read-operations are checked again. In the example of Fig. 1, the read- 
operation in the specification is marked at the equivalence class of adr2. When 
setting the equivalence classes of adr1 and adr2 unequivalent, the equivalence 
of the read-operations is detected. 

6 Detecting Equivalent Memory States 

Detecting the equivalence of two memory states is necessary to show compu- 
tational equivalence but also required to be able to argue about the equiva- 
lence of two read-operations in specification and implementation. Since a store- 
operation returns the whole new memory state, finding equivalent memory states 
is the same as detecting equivalent store-operations. 

6.1 Identical Order of Store-Operations 

In some designs, the order of store-operations is identical in the two descriptions 
to compare. A sufficient, but not necessary condition for the equivalence of two 
store-operations and, therefore, the resulting memory states, is that addresses, 
the values stored, and the previous memory states are pairwise in the same 
equivalence class, which is fast to test and, therefore, checked first when finding 
a new store-operation. The final value of a memory in implementation and 
specification depends on the last two stores on both sides, which use the result 
of the previous stores as first arguments. By means of an inductive argument, 
when building a list in order of appearance of the stores in implementation and 
specification, every store may have its “partner” on the other side, if the order 
of store-operations is identical. The first store-operations on both sides have 
the initial memory state as first argument, which is identical for both sides. 
Specification and implementation can have also only partially identical orders 
of stores, which begin from two equivalent memory states. The latter may be 
either the initial memory state or memory states that have been identified to be 
equivalent by one of the techniques described in section 6.2 and 6.3. The partially 
identical store-order ends before the first store-operation-pair, where either 
address or stored value are not equivalent. The order of store-operations has to 



store(dmem, adr1, val1)                               store(dmem, adr1, val1) 
store(rf, adr2, val2)        have the same store      store(dmem, adr3, val3) 
store(dmem, adr3, val3)      order for both rf        store(rf, adr2, val2) 
store(rf, adr4, val4)        and dmem                 store(rf, adr4, val4) 

Fig. 5. Identical store-orders 

be the same in specification and implementation only with regard to the same 
specific memory. The interleaving of store-operations to different memories can 
be arbitrary, see for example Fig. 5. 
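
The per-memory notion of store order can be sketched by projecting each description onto the individual memories before comparing. The tuple layout (memory, address, value) and the function names are assumptions made for illustration; `equivalent` stands in for the =sim test and defaults to textual identity.

```python
from collections import defaultdict


def per_memory_order(stores):
    """Group (memory, address, value) stores by memory, keeping their order."""
    order = defaultdict(list)
    for mem, adr, val in stores:
        order[mem].append((adr, val))
    return dict(order)


def same_store_order(left, right, equivalent=lambda a, b: a == b):
    """True if, for every memory, both sides store pairwise equivalent
    addresses and values in the same order."""
    l, r = per_memory_order(left), per_memory_order(right)
    if l.keys() != r.keys():
        return False
    return all(
        len(l[m]) == len(r[m])
        and all(equivalent(a1, a2) and equivalent(v1, v2)
                for (a1, v1), (a2, v2) in zip(l[m], r[m]))
        for m in l
    )


# The two sequences of Fig. 5 interleave rf and dmem differently.
left = [("dmem", "adr1", "val1"), ("rf", "adr2", "val2"),
        ("dmem", "adr3", "val3"), ("rf", "adr4", "val4")]
right = [("dmem", "adr1", "val1"), ("dmem", "adr3", "val3"),
         ("rf", "adr2", "val2"), ("rf", "adr4", "val4")]
print(same_store_order(left, right))
```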



6.2 Overwritten Store-Operations 

An identical store-order as defined in the previous section requires an equal 
number of store-operations, which is not given in the example of Fig. 6. Nev- 
ertheless, the final memory states are identical if the value stored by the sec- 
ond store-operation of the implementation is always overwritten by the third 
store-operation, i.e., if the addresses are =sim.

Specification                              Implementation
mem1^s ← store (mem, adr1, val1);          mem1^i ← store (mem, adr1, val1);
mem2^s ← store (mem1^s, adr2, val2);       mem2^i ← store (mem1^i, adrX, valX);
                                           mem3^i ← store (mem2^i, adr2, val2);

adrX =sim adr2  ⇒  mem2^s =sim mem3^i

Fig. 6. Example for an overwritten store-operation

This situation can occur, for instance, if the second store is speculative, but the speculation fails and the third
store is used to correct the fault. Let us assume that val2 and valX are not
=sim. Therefore, mem2^s and mem2^i cannot be in the same equivalence class and the
equivalence detection of section 6.1 will fail. But for the last store-operation
in the implementation there is no difference if the previous memory state is
mem1^i or mem2^i. Therefore the relevant preceding memory state is calculated for
equivalence checking. This is either the memory state after the first preceding 
store-operation, which is not overwritten by the new store-operation or the 
initial memory state. Two store-operations in specification and implementation
are equivalent if addresses, stored values and the relevant preceding memory
state are =sim. This criterion copes with different numbers of overwritten
store-operations in specification and implementation. Determining the relevant
preceding memory state is fast, since, again, only the information of the equiva- 
lence classes is used. Furthermore, its calculation is only necessary if there exists 
a potential “counterpart” with =sim address and stored value. 
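
A minimal sketch of this criterion, under simplifying assumptions: each store is a pair of hypothetical equivalence-class labels (address class, value class), memory state j denotes the state after the first j stores (0 is the initial state), and the set of memory-state pairs already known to be equivalent is given as input. The function names are illustrative only.

```python
def relevant_preceding_state(stores, k):
    """Index of the memory state that matters for store k.

    Walks back over directly preceding stores whose address class equals
    that of store k, since they are overwritten by store k."""
    adr_class = stores[k][0]
    j = k
    while j > 0 and stores[j - 1][0] == adr_class:
        j -= 1
    return j

def stores_equivalent(spec, s_idx, impl, i_idx, equivalent_states):
    """Equivalent if address, stored value and relevant preceding memory
    state agree. `equivalent_states` is a set of (spec_state, impl_state)
    index pairs assumed to be known from earlier checks."""
    s_adr, s_val = spec[s_idx]
    i_adr, i_val = impl[i_idx]
    if s_adr != i_adr or s_val != i_val:
        return False
    pair = (relevant_preceding_state(spec, s_idx),
            relevant_preceding_state(impl, i_idx))
    return pair in equivalent_states

# Fig. 6 with adrX =sim adr2 (same class "a2") and valX not =sim val2.
spec = [("a1", "v1"), ("a2", "v2")]
impl = [("a1", "v1"), ("a2", "vX"), ("a2", "v2")]
equivalent_states = {(0, 0), (1, 1)}  # initial states, and states after store 1
```

With these inputs the last stores of both descriptions are recognized as equivalent, because the speculative store in the implementation is skipped when tracing back.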

Note that by considering overwritten stores, there are some special cases where
more than the usual two store-operations, one of the specification and one of
the implementation, are in a single equivalence class. For instance, the
memory states after the second and the third store in the implementation in Fig. 6
are identical if adr2 =sim adrX and val2 =sim valX hold.



6.3 Changed Order of Store-Operations 

If the store-order is changed as in the example of Fig. 1 for rf and Fig. 7 for
mem, the final memory states can be equivalent if the addresses of the
store-operations are ≠sim. A correct reordering of store-operations can be the result,
for example, of synthesizing designs with data mapping, see section 1.

When a new store-operation is found and all previous checks fail, there might
exist a store in the other description with equivalent address and stored value,
which is the “counterpart” in a changed store order. Since the new store is
the most recent in its description, there must be some store-operations before
it, which happen after the “counterpart” in the other description. Assume that
the new store is D^s and the “counterpart” D^i in Fig. 7. The stores B and C are
before D in the specification but after D in the implementation. The stores O1,
O2, O3 are overwritten by succeeding store-operations, i.e., B^s, C^s, or B^i. A
valid reordering of the store-operations requires that the addresses of D on the
one hand and B, C on the other hand are ≠sim. But we do not know that only
B and C have to be checked, since there might be some overwritten stores O1^s,
O2^s or O3^i in-between or before B or C (see Fig. 7). For a quick test, first two
sets containing all memory states previous to D^s/D^i are determined, where all
store-operations after those memory states and before D^s/D^i have a determined
address relationship; i.e., the addresses of those store-operations must be either
≠sim to the address of D^s/D^i or =sim to the address of one of the succeeding
store-operations. A changed store-order is only checked if there are equivalent
memory states in those two sets calculated for D^s and D^i. In the following, we
say that D^s and D^i have a common access state.

Specification                                  Implementation
      mem1^s ← store (mem, adrA, ...);               mem1^i ← store (mem, adrA, ...);
O1^s  mem2^s ← store (overwritten later);      D^i   mem2^i ← store (mem1^i, adrD, ...);
B^s   mem3^s ← store (mem2^s, adrB, ...);      C^i   mem3^i ← store (mem2^i, adrC, ...);
O2^s  mem4^s ← store (overwritten later);      O3^i  mem4^i ← store (overwritten later);
C^s   mem5^s ← store (mem4^s, adrC, ...);      B^i   mem5^i ← store (mem4^i, adrB, ...);
D^s   mem6^s ← store (mem5^s, adrD, ...);

(adrD ≠sim adrC) ∧ (adrD ≠sim adrB) ∧ (adrB ≠sim adrC)  ⇒  mem6^s =sim mem5^i

Fig. 7. Changed order of store-operations

The next step is to determine the two sequences S1 and S2 containing the same
store-operations appearing in the two descriptions in changed order. This is not
obvious since only the end of S1 and the beginning of S2 are known. Furthermore,
overwritten stores have to be considered correctly, i.e., S1 = {B^s, C^s, D^s} and
S2 = {D^i, C^i, B^i} in Fig. 7. We assume in the following that all store-operations
of the changed store-order have already appeared first in the implementation
(A^i to B^i) and now the last store of the opposite sequence, store_last^s, is detected
during the simulation, i.e., D^s. This is the first time where again equivalent
memory states can be reached. Since D^s is the newest store detected during the
simulation, the algorithm assumes that this is the last element missing and that
it is the end of S1. Tracing back from this point, the first (previous) memory
state is searched which has an equivalent counterpart in the other description,
i.e., mem1^s and mem1^i in Fig. 7. All preceding stores do not have to be considered
since they lead to an equivalent memory state in implementation and specification.
The store-operations in the two descriptions directly after this equivalent
memory state, store_first^s (O1^s) and store_first^i (D^i), are the beginnings of S1
and S2 before eliminating overwritten store-operations.

Overwritten stores can be removed easily in S1 since the latest store, store_last^s (D^s),
is known. Tracing back from store_last^s (D^s) to store_first^s (O1^s), all
store-operations with an address which is =sim to the address of a succeeding store
are eliminated, which results in S1 = {B^s, C^s, D^s}.

The end store_last^i of the sequence S2 is unknown, which makes eliminating
overwritten store-operations harder. Symbolic simulation may have already reached
some store-operation after B^i which overwrites, for instance, C^i but has to be
ignored to find S2 correctly. All store-operations after the unknown end store_last^i
(B^i) do not have to be considered when eliminating overwritten stores in S2. S2
is determined by beginning with store_first^i and successively adding succeeding
stores. Every time a new store is added, stores that are possibly overwritten by it
are eliminated. This process is stopped when the number of store-operations in S2
is the same as in S1.
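
The elimination of overwritten stores and the final partner check can be sketched as below. This is a simplified illustration, not the algorithm of the tool: stores are hypothetical triples (name, address class, value class), the end of both sequences is assumed to be known already, and the common-access-state test is omitted. The class labels attached to O1, O2 and O3 are invented for the example.

```python
def eliminate_overwritten(seq):
    """Drop every store whose address class equals that of a later store.

    Traces back from the last store, keeping only the most recent store
    per address class, and returns the survivors in program order."""
    kept, seen = [], set()
    for name, adr, val in reversed(seq):
        if adr not in seen:
            kept.append((name, adr, val))
            seen.add(adr)
    kept.reverse()
    return kept

def same_stores_in_changed_order(s1, s2):
    """Every store in s1 has a partner in s2 with the same address class
    and value class (order is allowed to differ)."""
    if len(s1) != len(s2):
        return False
    return ({(adr, val) for _, adr, val in s1}
            == {(adr, val) for _, adr, val in s2})

# Stores after the common equivalent memory state, loosely following Fig. 7.
spec_tail = [("O1", "b", "x1"), ("B", "b", "vb"), ("O2", "c", "x2"),
             ("C", "c", "vc"), ("D", "d", "vd")]
impl_tail = [("D", "d", "vd"), ("C", "c", "vc"), ("O3", "b", "x3"),
             ("B", "b", "vb")]

s1 = eliminate_overwritten(spec_tail)
s2 = eliminate_overwritten(impl_tail)
```

Here `s1` contains B, C, D and `s2` contains D, C, B, so the partner check succeeds; the reordering is only valid if, in addition, the addresses of the reordered stores are pairwise ≠sim.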

Finally it is checked whether every store-operation in S1 has its partner in S2 with
=sim address, =sim stored value and common access state (see above). In this
case, the memory states after store_last^s and store_last^i, i.e., D^s and B^i, are
equivalent. Note that the technique described in this section is not limited with respect
to the length of the changed store order, which is three in our example.

The handling of some exceptional situations is omitted in this paper due to
space limitations. Consider for example that a store E^i directly succeeds B^i
and overwrites C^i with exactly the same value as C^i. D^s is then not only
equivalent to B^i but also to E^i. This is detected by building two sequences S2a and
S2b with B^i and E^i as last elements in this special case.

7 Experimental Results 

Results of the verification of four designs with increasing complexity concerning
the equivalence detection of read/store-operations are given in Tab. 1. In all
examples, a sequential specification is compared with the corresponding pipelined
implementation. The specifications reflect a subset of the instructions of the 
respective architecture, i.e., the Alpha-architecture from Digital [6], the DLX- 
architecture [8], and the PIC16C5X-processor from Microchip [10]. The imple- 
mentations have been generated automatically from the specifications using the 
synthesis tool described in [9]. Verification of the pipelined designs was done 
using the flushing approach of [5], see also [13].

Tab. 1 gives the verification time, the number of instruction classes and the
total number of paths checked during the symbolic simulation. The sixth column
shows in how many paths stores are overwritten (section 6.2). The number of 
paths with changed order of the store-operations (section 6.3) is given in the 
last column. Paths with changed store-order are not considered in the sixth 
column although in these paths stores may be overwritten, too. 

The store-order in the DLX-example is always identical in specification and 
implementation and no overwritten stores have to be considered. We obtained 
the same result for the verification of two structural DLX-descriptions with 5 
pipeline-stages (see [13] for details). The Alpha-example requires additionally 
the consideration of overwritten store-operations. Consider two stores to the 
register file of the Alpha with equivalent addresses, which are executed con- 
secutively in the sequential description. One of them is skipped if they are in 
different instruction stages which are parallelized by the synthesis tool. Note that 
the register file of the DLX (respectively the data memory) is always written in 
the same instruction stage. 

All techniques presented in the previous sections are required to verify the two
PIC-examples. The store-order was changed significantly in many paths after
introducing pipelining. The reason is the data memory mapping used by this
architecture, i.e., single registers are addressed in the same manner as registers of
the register file, which makes synthesis (and verification) more complicated since
numerous additional data-conflicts have to be resolved. This is also demonstrated
by the higher complexity of PIC 2 compared to PIC 1. The only difference in PIC
1 is that the STATUS-register is excluded from data mapping. Another reason for
the complexity of the PIC-examples compared to the Alpha- and DLX-example
(which have more pipeline stages) is the larger number of instruction classes.
Verification with the symbolic simulator revealed a rare bug in the implementation
of the synthesis tool. The decrement-instruction in the PIC-implementations
was not writing its value correctly in the register file in a corner case due to an
erroneous simplification of a condition during synthesis.

Description  Pipeline  Instruction  Verification  Total    Paths with stores
             stages    classes      time          paths    overwritten  changed order
DLX          5         5            561.1 s       225687   -            -
Alpha        3         10           7.84 s        2374     88           -
PIC 1        2         17           252.6 s       107655   3151         1741
PIC 2        2         17           379.6 s       161622   4338         5252

Table 1. Experimental results. Measurements on a Sun Ultra II (300 MHz)

We verified the Alpha-example with the test for changed store order switched off
and the DLX-example also without the checks for overwritten stores. The
computation time changed by less than one second, which demonstrates that the
overhead introduced by testing for complex read/store-schemes in the equivalence
detection is acceptable.

8 Conclusion

Symbolic simulation must be able to handle memory operations to make formal
verification of designs with complex control logic possible. The verification tool
has to cope with two aspects. Firstly, the memories are in general large. We
therefore reason only about memory operations, i.e., store- and read-operations,
and the symbolic simulator has to detect equivalences of memory operations
in order to model the behavior of the memory correctly.

Secondly, indirect addressing has to be considered. This makes a reasoning pro- 
cess about the relationships of the addresses during the symbolic simulation 
necessary, since they can be arbitrary symbolic terms. Collecting equivalent 
symbolic terms in equivalence classes permits us to establish a fast address com- 
parison for our equivalence detection methods. The new technique makes an 
efficient automatic equivalence checking of descriptions with complex reorderings
of memory operations possible.

Abstract. Unfold/fold transformation systems for logic programs have
been extensively investigated. Existing unfold/fold transformation sys- 
tems for normal logic programs allow only Tamaki-Sato style folding us- 
ing clauses from a previous program in the transformation sequence: i.e., 
they fold using a single, non-recursive clause. In this paper we present 
a transformation system that permits folding in the presence of recur- 
sion, disjunction, as well as negation. We show that the transformations 
are correct with respect to various semantics of negation including the 
well-founded model and stable model semantics. 

1 Introduction 

Unfold/fold transformation systems for logic programs have been used for auto- 
mated deduction [8, 17], and program specialization and optimization [2, 4, 10, 
15]. Normal logic programs consist of definitions of the form $A$ :- $\phi$ where $A$ is an
atom and $\phi$ is a boolean formula over atoms. Unfolding replaces an occurrence of
$A$ in a program with $\phi$ while folding replaces an occurrence of $\phi$ with $A$. Folding
is called reversible if its effects can be undone by an unfolding, and irreversible 
otherwise. 

Given a logic program $P$, an unfold/fold transformation system generates a
sequence of programs $P = P_0, P_1, \ldots, P_n$, such that for all $0 \le i < n$, $P_{i+1}$
is obtained from $P_i$ by applying one of the two transformations. Unfold/fold
transformation systems are proved correct by showing that all programs in any
transformation sequence $P_0, P_1, \ldots, P_n$ are equivalent under a suitable semantics,
such as the well-founded model semantics for normal logic programs. A
comprehensive survey of research on logic program transformations appears in [14].

As an illustration of unfolding/folding, consider the sequence of normal logic
programs in Figure 1. In the figure, $P_1$ is derived from $P_0$ by unfolding the
occurrence of q(X) in the first clause of $P_0$. Program $P_2$ is derived from $P_1$ by
folding the literal q(Y) in the body of the second clause of p/1 into p(Y) by
using p(X) :- q(X) in $P_0$. This clause from a previous program which is used
in a folding step (the clause p(X) :- q(X) of $P_0$ in this case) is called the folder.

CDA-9805735 and EIA-9705998.

An unfold/fold transformation system for definite logic programs was first 
described in a seminal paper by Tamaki and Sato [20]. It allows folding using
a single clause only (conjunctive folding) from the initial program. This
folder clause is required to be non-recursive, but need not be present in the
current program $P_i$. Maher [12] proposed a transformation system using only
reversible folding in which the folder clause is always drawn from the current
program. However, reversibility is a restrictive condition that limits the power
of unfold/fold systems by disallowing many correct transformations, such as the
one used to derive $P_2$ from $P_1$ in Figure 1. Hence, there was considerable interest
in developing irreversible unfold/fold transformation systems, for both definite 
and normal logic programs. 

Existing unfold/fold transformation systems for normal logic programs [1, 
13, 18, 19] are either extensions of Maher’s reversible transformation system [12] 
or the original Tamaki-Sato system [20]. Even for definite logic programs, irre- 
versible transformations of programs were, until recently, restricted to either fold- 
ing using non-recursive clauses (see [7]) or a single recursive clause (see [9, 21]). 
In [16] we proposed a transformation framework for definite logic programs which 
generalized the above systems by permitting folding using multiple recursive 
clauses. Construction of such a general transformation system for normal logic 
programs has remained open. Below, we describe a solution to this problem. 

Overview of the results: The main result of this paper is an unfold/fold
transformation system that performs folding in the presence of recursion,
disjunction as well as negation (see Section 2). The transformations of [16] associate
counters with program clauses (à la Kanamori and Fujita [9]) to determine the
applicability of fold and unfold transformations. In this paper, we extend this
book-keeping to accommodate negative literals. We show that this extension is
sufficient to guarantee that the resulting transformation system preserves a
variety of semantics for normal logic programs, such as the well-founded model,
stable model, partial stable model, and stable theory semantics. Central to this
proof is the result due to Dung and Kanchanasut [6] that preserving the
semantic kernel of a program is sufficient to guarantee the preservation of the
different semantics for negation listed above. However, in contrast to [1] where
this idea was used to prove the correctness of Tamaki-Sato style transformations,
we present a two-step proof which explicitly uses the operational counterpart of
semantic kernels (see Section 3). In the first step of our proof, we show that
the transformations preserve positive ground derivations, which are derivations
of the form $A \leadsto \neg B_1, \neg B_2, \ldots, \neg B_n$ such that there is a proof tree rooted at
$A$ with leaves labeled $\neg B_1$ through $\neg B_n$ (apart from true). We then show that
preserving positive ground derivations is equivalent to preserving the semantic
kernel of the program. Thus positive ground derivations are the operational
analogues of semantic kernels.

p(X) :- q(X).
q([]).
q([X|Y]) :- ¬r(X), q(Y).

Program $P_0$

p([]).
p([X|Y]) :- ¬r(X), q(Y).
q([]).
q([X|Y]) :- ¬r(X), q(Y).

Program $P_1$

p([]).
p([X|Y]) :- ¬r(X), p(Y).
q([]).
q([X|Y]) :- ¬r(X), q(Y).

Program $P_2$

Fig. 1. Example of an unfold/fold transformation sequence

This proof suggests that we can treat the negative literals in a program as 
atoms of new predicates defined in a different (external) module. The correctness 
of the transformation system is assured as long as the transformations respect 
module boundaries (see Section 4). This observation indicates how a transfor- 
mation system originally designed for definite logic programs (such as the one 
we proposed in [16]) can be readily adapted for normal logic programs. 

2 The Transformation System 

Below we present our unfold and fold transformations for normal logic programs.
In the following we assume familiarity with the standard notions of terms,
substitutions, unification, atoms, and literals. We will use the following symbols (possibly
with primes and subscripts): $P$ to denote a normal logic program; $C$ and $D$ for
clauses; $A$, $B$ to denote atoms; $L$, $K$ to denote literals; $\mathcal{N}$ to denote a sequence
of literals; and $\sigma$, $\theta$ for substitutions.

In any transformation sequence $P_0, P_1, \ldots, P_n$ we annotate each clause $C$ in
program $P_i$ with a pair $(\gamma^i_{lo}(C), \gamma^i_{hi}(C))$ where $\gamma^i_{lo}(C), \gamma^i_{hi}(C) \in \mathbb{Z}$ and
$\gamma^i_{lo}(C) \le \gamma^i_{hi}(C)$. Thus, $\gamma^i_{lo}$ and $\gamma^i_{hi}$ are functions from the set of clauses in program $P_i$
to the set of integers $\mathbb{Z}$. The transformation rules dictate the construction of
$\gamma^{i+1}_{lo}$ and $\gamma^{i+1}_{hi}$ from $\gamma^i_{lo}$ and $\gamma^i_{hi}$. We assume that for any clause $C$ in the initial
program $P_0$, $\gamma^0_{lo}(C) = \gamma^0_{hi}(C) = 1$. Intuitively, $\gamma^i_{lo}(C)$ and $\gamma^i_{hi}(C)$ for a clause
$C$ are analogous to the Kanamori-Fujita-style counters [9]; the separation of hi
and lo permits us to store estimates of the counter values in the presence of
disjunctive folding.

Rule 1 (Unfolding) Let $C$ be a clause in $P_i$ and $A$ a positive literal in the
body of $C$. Let $C_1, \ldots, C_m$ be the clauses in $P_i$ whose heads are unifiable with $A$
with most general unifiers $\sigma_1, \ldots, \sigma_m$. Let $C'_j$ be the clause that is obtained by
replacing $A\sigma_j$ by the body of $C_j\sigma_j$ in $C\sigma_j$ ($1 \le j \le m$).

Then, assign $P_{i+1} := (P_i - \{C\}) \cup \{C'_1, \ldots, C'_m\}$. Set $\gamma^{i+1}_{lo}(C'_j) = \gamma^i_{lo}(C) +
\gamma^i_{lo}(C_j)$ and $\gamma^{i+1}_{hi}(C'_j) = \gamma^i_{hi}(C) + \gamma^i_{hi}(C_j)$. The annotations of all other clauses
in $P_{i+1}$ are inherited from $P_i$. □

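Rule 2 (Folding) Let $\{C_1, \ldots, C_m\}$ be clauses in $P_i$ and $\{D_1, \ldots, D_m\}$ be clauses
in $P_j$ ($j \le i$) where $C_l$ denotes the clause $A$ :- $L_{l,1}, \ldots, L_{l,n_l}, L'_1, \ldots, L'_n$ and $D_l$
denotes the clause $B_l$ :- $K_{l,1}, \ldots, K_{l,n_l}$. Also, let

1. $\forall\, 1 \le l \le m\ \forall\, 1 \le k \le n_l$: $L_{l,k} = K_{l,k}\sigma_l$ where $\sigma_l$ is a substitution.

2. $B_1\sigma_1 = B_2\sigma_2 = \cdots = B_m\sigma_m = B$

3. $D_1, \ldots, D_m$ are the only clauses in $P_j$ whose heads are unifiable with $B$.

4. $\forall\, 1 \le l \le m$: $\sigma_l$ substitutes the internal variables of $D_l$ to distinct variables
which do not appear in $\{A, B, L'_1, \ldots, L'_n\}$.

5. $\forall\, 1 \le l \le m$: $\gamma^j_{hi}(D_l) < \gamma^i_{lo}(C_l)$ + number of positive literals in the sequence $L'_1, \ldots, L'_n$.

Then, assign $P_{i+1} := (P_i - \{C_1, \ldots, C_m\}) \cup \{C'\}$ where $C' \equiv A$ :- $B, L'_1, \ldots, L'_n$. Set
$\gamma^{i+1}_{lo}(C') = \min_{1 \le l \le m}(\gamma^i_{lo}(C_l) - \gamma^j_{hi}(D_l))$ and $\gamma^{i+1}_{hi}(C') = \max_{1 \le l \le m}(\gamma^i_{hi}(C_l) -
\gamma^j_{lo}(D_l))$. The annotations of all other clauses in $P_{i+1}$ are inherited from $P_i$. □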

An Example: The following example (derived from [7]) illustrates the use of 
our basic unfold/fold transformation system. 

C1 : in_position(X,L) :- in_odd(X,L), ¬even(X).   (1,1)
C2 : in_position(X,L) :- in_even(X,L), ¬odd(X).   (1,1)

C3 : in_odd(X,[X|L]).   (1,1)
C4 : in_odd(X,[Y,Z|L]) :- in_odd(X,L).   (1,1)
C5 : in_even(X,[Y,X|L]).   (1,1)
C6 : in_even(X,[Y,Z|L]) :- in_even(X,L).   (1,1)

In the above program, in_odd(X,L) (in_even(X,L)) is true if X appears in an
odd (even) position in list L. Thus, in_position(X,L) is true if X is in an odd
(even) position in list L, and X is not an even (odd) number. The odd/1 and
even/1 predicates are encoded in the usual way and are not shown.

Unfolding in_odd(X,L) in C1 we get the following clauses:

C7 : in_position(X,[X|L]) :- ¬even(X).   (2,2)
C8 : in_position(X,[Y,Z|L]) :- in_odd(X,L), ¬even(X).   (2,2)

Unfolding in_even(X,L) in C2 yields the following clauses:

C9 : in_position(X,[Y,X|L]) :- ¬odd(X).   (2,2)
C10 : in_position(X,[Y,Z|L]) :- in_even(X,L), ¬odd(X).   (2,2)

Finally, we fold clauses {C8, C10} using the clauses {C1, C2} from the initial
program as the folder to obtain the following definition of in_position/2.

C7 : in_position(X,[X|L]) :- ¬even(X).   (2,2)
C9 : in_position(X,[Y,X|L]) :- ¬odd(X).   (2,2)
C11 : in_position(X,[Y,Z|L]) :- in_position(X,L).   (1,1)

Note that the final step is an irreversible folding in presence of negation that uses 
multiple clauses as the folder. Such a folding step is neither allowed in Tamaki- 
Sato style transformation systems for normal logic programs [1, 18, 19] nor in 
reversible transformation systems [13]. 
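
The counter book-keeping in this example can be replayed with a small sketch. The function names are hypothetical, and the folding update follows one reading of Rule 2, which is an assumption: the new lower bound is the minimum of (lo of folded clause minus hi of folder clause), and the new upper bound is the maximum of (hi of folded clause minus lo of folder clause). Because all annotations in the example have lo = hi, the example itself does not distinguish between readings.

```python
def unfold_counters(clause, resolvers):
    """Annotations of the clauses produced by unfolding (Rule 1).

    `clause` is the (lo, hi) pair of the unfolded clause; `resolvers`
    holds the (lo, hi) pairs of the clauses used to resolve the literal."""
    lo, hi = clause
    return [(lo + r_lo, hi + r_hi) for r_lo, r_hi in resolvers]

def fold_counters(folded, folders):
    """Annotation of the clause produced by folding (assumed reading of Rule 2).

    `folded[l]` and `folders[l]` are the (lo, hi) pairs of C_l and D_l."""
    lo = min(c_lo - d_hi for (c_lo, _), (_, d_hi) in zip(folded, folders))
    hi = max(c_hi - d_lo for (_, c_hi), (d_lo, _) in zip(folded, folders))
    return (lo, hi)

# Unfolding in_odd(X,L) in C1 (1,1) using C3 and C4 (both (1,1)) gives C7, C8.
c7, c8 = unfold_counters((1, 1), [(1, 1), (1, 1)])
# Unfolding in_even(X,L) in C2 (1,1) using C5 and C6 gives C9, C10.
c9, c10 = unfold_counters((1, 1), [(1, 1), (1, 1)])
# Folding {C8, C10} using the folder clauses {C1, C2} gives C11.
c11 = fold_counters([c8, c10], [(1, 1), (1, 1)])
```

The resulting pairs agree with the annotations shown in the example: (2,2) for the unfolded clauses and (1,1) for the folded clause.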

Remark: We can maintain more elaborate book-keeping information than inte- 
ger counters, thereby deriving more expressive unfold/fold systems. For instance, 
as in the SCOUT system described in [16], we can make the counters range over
tuples of integers, and obtain a system that is strictly more powerful than
the existing Tamaki-Sato-style systems [20, 21, 9, 18, 19, 7, 1]. The construction
parallels that of the SCOUT system in [16]; details are omitted.

3 Proof of Correctness

In this section, we show that our unfold/fold transformation system is correct 
with respect to various semantics of normal logic programs. This proof proceeds 
in three steps. First, we introduce the notion of positive ground derivations and 
show that it is preserved by the transformations. Secondly, we show that preserving
positive ground derivations is equivalent to preserving the semantic kernel [6].
Finally, following [1], preserving the semantic kernel implies that the transformation
system is correct with respect to various semantics for normal logic programs 
including well-founded model, stable model, partial stable model, and stable 
theory semantics. We begin with a review of semantic kernels. 



3.1 Semantic Kernel of a Program 

Definition 1 (Quasi-Interpretation) [6, 1] A quasi-interpretation of a normal
logic program $P$ is a set of ground clauses of the form $A$ :- $\neg B_1, \ldots, \neg B_n$
($n \ge 0$) where $A, B_1, \ldots, B_n$ are ground atoms in the Herbrand Base of $P$.

Quasi-interpretations form the universe over which semantic kernels are defined.
For a given normal logic program $P$, the set of all quasi-interpretations of $P$
(denoted $QI(P)$) forms a complete partial order with a least element (the empty
set $\emptyset$) with respect to the set inclusion relation $\subseteq$.

Definition 2 Given a normal logic program $P$, let $Gnd(P)$ denote the set of
all possible ground instantiations of all clauses of $P$. The function $S_P$ on
quasi-interpretations of $P$ is defined as

$S_P : QI(P) \to QI(P)$

$S_P(I) = \{\mathcal{R}(C, D_1, \ldots, D_m) \mid C \in Gnd(P) \wedge D_i \in I,\ 1 \le i \le m\}$

where, if $D_i$ ($1 \le i \le m$) are ground clauses

$A_i$ :- $\neg B_{i,1}, \ldots, \neg B_{i,n_i}$ ($n_i \ge 0$)

and $A_1, \ldots, A_m$ ($m \ge 0$) are the only positive literals appearing in the body of
ground clause $C$, then $\mathcal{R}(C, D_1, \ldots, D_m)$ is the clause obtained by resolving the
positive body literals $A_1, \ldots, A_m$ in $C$ using clauses $D_1, \ldots, D_m$ respectively. □

If $P$ is a definite program, then the function $S_P$ is identical to the logical
consequence operator $T_P$ [11]. The semantic kernel of the program $P$ is defined in
terms of $S_P$ as:

Definition 3 (Semantic Kernel) [6, 1] The semantic kernel of a normal logic
program $P$, denoted by $SK(P)$, is the least fixed point of the function $S_P$, i.e.,

$SK(P) = \bigcup_{n \ge 0} SK^n(P)$ where $SK^0(P) = \emptyset$ and $SK^{n+1}(P) = S_P(SK^n(P))$

Example: Consider the following normal logic program $P$:

p :- ¬q, r.
r :- ¬r.

The semantic kernel of $P$ will be computed as follows.

$SK^0(P) = \{\}$

$SK^1(P) = S_P(SK^0(P)) = \{ (r \text{ :- } \neg r) \}$

$SK^2(P) = S_P(SK^1(P)) = \{ (r \text{ :- } \neg r),\ (p \text{ :- } \neg q, \neg r) \}$

$SK^3(P) = S_P(SK^2(P)) = SK^2(P)$

Therefore, $SK(P) = \{ (r \text{ :- } \neg r),\ (p \text{ :- } \neg q, \neg r) \}$
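
For a finite ground program the fixed-point iteration of this example can be reproduced with a short sketch. The representation is an assumption made for illustration: a ground clause is a triple (head, positive body atoms, negated body atoms), a quasi-interpretation is a set of pairs (head, negated atoms), and bodies are treated as sets rather than sequences. The names `s_p` and `semantic_kernel` are hypothetical.

```python
from itertools import product

def s_p(program, interp):
    """One application of S_P to the quasi-interpretation `interp`.

    Every positive body atom of a ground clause is resolved with some
    clause of `interp` that has this atom as its head."""
    result = set()
    for head, pos, neg in program:
        # For each positive atom, the negative bodies it can be replaced by.
        choices = [[d_neg for d_head, d_neg in interp if d_head == atom]
                   for atom in sorted(pos)]
        for combo in product(*choices):
            body = set(neg)
            for d_neg in combo:
                body |= d_neg
            result.add((head, frozenset(body)))
    return result

def semantic_kernel(program):
    """Iterate S_P from the empty quasi-interpretation to a fixed point.

    Assumes a finite ground program, so the iteration terminates."""
    current = set()
    while True:
        following = s_p(program, current)
        if following == current:
            return current
        current = following

# The example program:  p :- not q, r.   r :- not r.
program = [
    ("p", frozenset({"r"}), frozenset({"q"})),
    ("r", frozenset(), frozenset({"r"})),
]
kernel = semantic_kernel(program)
```

The computed `kernel` contains the two clauses r :- ¬r and p :- ¬q, ¬r, matching the result derived by hand above.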

The following theorem from [1] formally states the equivalence of P and SK(P) 
with respect to various semantics of normal logic programs. 

Theorem 1 [Aravindan and Dung] Let $P$ be a normal logic program and $SK(P)$
be its semantic kernel. Then:

(1) If $P$ is a definite logic program, then $P$ and $SK(P)$ have the same least
Herbrand Model.

(2) If $P$ is a stratified program, then $P$ and $SK(P)$ have the same perfect model
semantics.

(3) $P$ and $SK(P)$ have the same well-founded model.

(4) $P$ and $SK(P)$ have the same stable model(s).

(5) $P$ and $SK(P)$ have the same set of partial stable models.

(6) $P$ and $SK(P)$ have the same stable theory semantics.

3.2 Preserving the Semantic Kernel 

We now show that in any transformation sequence $P_0, P_1, \ldots, P_n$ where, for all
$0 \le i < n$, $P_{i+1}$ is obtained from $P_i$ by applying unfolding (rule 1) or folding (rule 2),
the semantic kernel is preserved, i.e., $SK(P_0) = SK(P_1) = \cdots = SK(P_n)$. To
do so, we introduce the following notion of a positive ground derivation:

Definition 4 (Positive ground derivation) A positive ground derivation of
a literal in a normal logic program $P$ is a tree $T$ such that: (1) each internal
node of $T$ is labeled with a ground atom, (2) each leaf node of $T$ is labeled with
a negative ground literal or the special symbol true, and (3) for any internal
node $A$ of $T$, $A$ :- $L_1, \ldots, L_n$ must be a ground instance of a clause in program
$P$ where $L_1, \ldots, L_n$ are the children of $A$ in $T$.

Thus, consider any positive ground derivation $T$ in program $P$. Let the root of $T$
be labeled with the ground literal $L$ and let $\mathcal{N}$ be the sequence of negative literals
derived in $T$, i.e., $\mathcal{N}$ is formed by appending the negative literals appearing in
the leaf nodes of $T$ from left to right. Then we say that $L$ derives $\mathcal{N}$ in $P$, and
denote such derivations by $L \leadsto_P \mathcal{N}$ (and $L \leadsto \mathcal{N}$ if $P$ is obvious from the
context). We overload this notation, often denoting existence of such derivations
also by $L \leadsto_P \mathcal{N}$. Note that if $L$ is a ground negative literal, there is only one
positive ground derivation for $L$ in any program, namely the empty derivation
$L \leadsto L$. We now define:

Definition 5 (Weight of a positive ground derivation) Let $L \leadsto_P \mathcal{N}$ be a
positive ground derivation. The number of internal nodes in this derivation (i.e.,
the number of nodes labeled with a ground positive literal) is called the weight of
the derivation.

Definition 6 (Weight of a pair) Let $P_0, P_1, \ldots, P_n$ be a transformation
sequence of normal logic programs. Let $L$ be a ground literal, $\mathcal{N}$ be a (possibly
empty) sequence of ground negative literals s.t. $L \leadsto_{P_0} \mathcal{N}$. Then, the weight of
$(L, \mathcal{N})$, denoted by $w(L, \mathcal{N})$, is the minimum of the weights of positive ground
derivations of the form $L \leadsto_{P_0} \mathcal{N}$.

Note that for any program $P_i$ in the transformation sequence, the weight of
any pair $w(L, \mathcal{N})$ is defined as the weight of the smallest derivation $L \leadsto_{P_0} \mathcal{N}$.

Definition 7 Let $P_0, P_1, \ldots, P_n$ be a transformation sequence of normal logic
programs. A positive ground derivation $L \leadsto_{P_i} \mathcal{N}$ is said to be weakly
weight-consistent if for every ground instance $A$ :- $L_1, \ldots, L_k$ of a clause $C$ used in this
derivation, we have $w(A, \mathcal{N}_A) \le \gamma^i_{hi}(C) + \sum_{1 \le l \le k} w(L_l, \mathcal{N}_l)$, where $\mathcal{N}_A, \mathcal{N}_1, \ldots, \mathcal{N}_k$
are the sequences of negative literals derived from $A, L_1, \ldots, L_k$ in this derivation.

Definition 8 Let $P_0, P_1, \ldots, P_n$ be a transformation sequence of normal logic
programs. A positive ground derivation $L \leadsto_{P_i} \mathcal{N}$ is said to be strongly
weight-consistent if for every ground instance $A$ :- $L_1, \ldots, L_k$ of a clause $C$ used in this
derivation, we have

- $w(A, \mathcal{N}_A) \ge \gamma^i_{lo}(C) + \sum_{1 \le l \le k} w(L_l, \mathcal{N}_l)$

- $\forall\, 1 \le l \le k$: $w(A, \mathcal{N}_A) > w(L_l, \mathcal{N}_l)$

where $\mathcal{N}_A, \mathcal{N}_1, \ldots, \mathcal{N}_k$ are the negative literal sequences derived from $A, L_1, \ldots, L_k$
in this derivation.

Definition 9 (Weight consistent program) Let $P_0, P_1, \ldots, P_n$ be a
transformation sequence of normal logic programs. Then, program $P_i$ is said to be
weight consistent if

- for any pair $(L, \mathcal{N})$, whenever $L$ derives $\mathcal{N}$ in $P_i$, there is a strongly weight
consistent positive ground derivation $L \leadsto_{P_i} \mathcal{N}$.

- every positive ground derivation in $P_i$ is weakly weight consistent.

Using the above definitions, we now state certain invariants which always
hold after the application of any unfold/fold transformation.

- $I1(P_i) \equiv \forall L\, \forall \mathcal{N}$ ($L$ derives $\mathcal{N}$ in $P_0$ $\Leftrightarrow$ $L$ derives $\mathcal{N}$ in $P_i$).

- $I2(P_i) \equiv P_i$ is a weight consistent program.

We now show that these invariants are maintained after every unfolding and
folding step. This allows us to claim that the set of positive ground derivations
of $P_0$ is identical to the set of positive ground derivations of program $P_i$.

Lemma 1 If ($\forall j \le i$: $I1(P_j)$) holds, then $\forall L\, \forall \mathcal{N}$ ($L$ derives $\mathcal{N}$ in $P_{i+1}$ $\Rightarrow$ $L$
derives $\mathcal{N}$ in $P_i$).

Lemma 2 (Preserving Weak Weight Consistency) Let $P_0, \ldots, P_i, P_{i+1}$ be
an unfold/fold transformation sequence s.t. $\forall\, 0 \le j \le i$: $I1(P_j) \wedge I2(P_j)$. Then,
all positive ground derivations of $P_{i+1}$ are weakly weight consistent.

The proofs for both Lemma 1 and 2 follow by induction on the weight of positive
ground derivations in $P_{i+1}$. We now establish the main theorem concerning the
preservation of positive ground derivations in a transformation sequence.

Theorem 2 Let $P_0, P_1, \ldots$ be a sequence of normal logic programs where $P_{i+1}$
is obtained from $P_i$ by applying unfolding (rule 1) or folding (rule 2). Then
$\forall i \ge 0$: $I1(P_i) \wedge I2(P_i)$.

Proof: The proof proceeds by induction on $i$. For the base case, $I1(P_0)$ holds
trivially, and $I2(P_0)$ holds because every positive ground derivation of $P_0$ is
weakly weight consistent, and for any pair $(L, \mathcal{N})$ the smallest positive ground
derivation $L \leadsto_{P_0} \mathcal{N}$ is strongly weight consistent.

For the induction step, we need to show $I1(P_{i+1}) \wedge I2(P_{i+1})$. By Lemma 1
we have $L \leadsto_{P_{i+1}} \mathcal{N} \Rightarrow L \leadsto_{P_i} \mathcal{N}$, and by Lemma 2 we know that all positive
ground derivations of $P_{i+1}$ are weakly weight consistent. We need to show that
(i) $L \leadsto_{P_i} \mathcal{N} \Rightarrow L \leadsto_{P_{i+1}} \mathcal{N}$, and (ii) for any pair $(L, \mathcal{N})$ s.t. $L \leadsto_{P_{i+1}} \mathcal{N}$, there
exists a strongly weight consistent derivation $L \leadsto_{P_{i+1}} \mathcal{N}$ in $P_{i+1}$. Thus, it suffices
to prove that for any pair $(L, \mathcal{N})$ s.t. $L \leadsto_{P_i} \mathcal{N}$, there exists a strongly weight
consistent derivation $L \leadsto_{P_{i+1}} \mathcal{N}$.

Consider a pair $(L, \mathcal{N})$ such that $L \leadsto_{P_i} \mathcal{N}$. Since $P_i$ is weight consistent,
there exists a strongly weight consistent derivation $L \leadsto_{P_i} \mathcal{N}$ in $P_i$. Let
this be called $Dr$. We now construct a strongly weight consistent derivation
$Dr' \equiv L \leadsto_{P_{i+1}} \mathcal{N}$. Construction of $Dr'$ proceeds by induction on the weight
of $(L, \mathcal{N})$ pairs. The base case occurs when $L$ is a negative literal, $\mathcal{N} = L$ and
$w(L, \mathcal{N}) = 0$. We then trivially have the same derivation $L \leadsto \mathcal{N}$ in $P_{i+1}$ as well.
Otherwise, if $L$ is a positive literal, let $C$ be the clause used at the root of $Dr$. Let
$L \text{ :- } L_1, \ldots, L_n$ be the ground instantiation of $C$ used at the root of $Dr$. Since
$Dr$ is strongly weight consistent, $w(L, \mathcal{N}) > w(L_l, \mathcal{N}_l)$ where $\mathcal{N}_l$ is the sequence
of negative literals derived by $L_l$, for all $1 \leq l \leq n$. Hence, we have strongly
weight consistent derivations $L_l \leadsto_{P_{i+1}} \mathcal{N}_l$. We construct $Dr'$ by considering the
following cases:

Case 1: $C$ is inherited from $P_i$ to $P_{i+1}$

$Dr'$ is constructed with the clause $L \text{ :- } L_1, \ldots, L_n$ at the root and then appending
the derivations $L_l \leadsto_{P_{i+1}} \mathcal{N}_l$ for all $1 \leq l \leq n$. This derivation $Dr'$ is strongly
weight consistent.

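Case 2: $C$ is unfolded.

Let $L_1$ be the positive body literal of $C$ that is unfolded. Let the clause used
to resolve $L_1$ in the derivation $Dr$ be $C_1$ and the ground instance of $C_1$ used be
$L_1 \text{ :- } L_{1,1}, \ldots, L_{1,k}$. By definition of unfolding, $L \text{ :- } L_{1,1}, \ldots, L_{1,k}, L_2, \ldots, L_n$ is
a ground instance of a clause $C'_1$ in $P_{i+1}$ with $\gamma^{i+1}_{lo}(C'_1) = \gamma^i_{lo}(C) + \gamma^i_{lo}(C_1)$. Also,
let $\mathcal{N}_{1,1}, \ldots, \mathcal{N}_{1,k}$ be the sequences of negative literals derived by $L_{1,1}, \ldots, L_{1,k}$
in $Dr$. Then, by strong weight consistency, $w(L_{1,l}, \mathcal{N}_{1,l}) < w(L_1, \mathcal{N}_1) < w(L, \mathcal{N})$
for all $1 \leq l \leq k$. Thus we have strongly weight consistent derivations
$L_{1,l} \leadsto_{P_{i+1}} \mathcal{N}_{1,l}$. We construct $Dr'$ by applying $L \text{ :- } L_{1,1}, \ldots, L_{1,k}, L_2, \ldots, L_n$ at the root
and then appending the strongly weight consistent derivations $L_{1,l} \leadsto_{P_{i+1}} \mathcal{N}_{1,l}$
(for all $1 \leq l \leq k$) and $L_l \leadsto_{P_{i+1}} \mathcal{N}_l$ (for all $2 \leq l \leq n$). Since $Dr$ is strongly
weight consistent, therefore

$w(L, \mathcal{N}) \geq \gamma^i_{lo}(C) + \sum_{1 \leq l \leq n} w(L_l, \mathcal{N}_l)$

and $w(L_1, \mathcal{N}_1) \geq \gamma^i_{lo}(C_1) + \sum_{1 \leq l \leq k} w(L_{1,l}, \mathcal{N}_{1,l})$

$\Rightarrow w(L, \mathcal{N}) \geq \gamma^{i+1}_{lo}(C'_1) + \sum_{1 \leq l \leq k} w(L_{1,l}, \mathcal{N}_{1,l}) + \sum_{2 \leq l \leq n} w(L_l, \mathcal{N}_l)$

This shows that $Dr'$ is strongly weight consistent.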

Case 3: $C$ is folded

Let $C$ (potentially with other clauses) be folded, using folder clause(s) from
$P_j$ ($j \leq i$), to clause $C'$ in $P_{i+1}$. Assume that $L_1, \ldots, L_k$ are the instances of the
body literals of $C$ which are folded. Then, $C'$ must have a ground instance of
the form $L \text{ :- } B, L_{k+1}, \ldots, L_n$, where $B \text{ :- } L_1, \ldots, L_k$ is a ground instance of a
folder clause $D$ in $P_j$. Since we have derivations $L_l \leadsto_{P_i} \mathcal{N}_l$ for all $1 \leq l \leq k$,
therefore by $I1(P_i) \wedge I1(P_j)$ there exist derivations $L_l \leadsto_{P_j} \mathcal{N}_l$. Then, there exists
a derivation $B \leadsto_{P_j} \mathcal{N}_B$ where $\mathcal{N}_B$ is obtained by appending the sequences
$\mathcal{N}_1, \ldots, \mathcal{N}_k$. Since $P_j$ is a weight consistent program, this derivation must be
weakly weight consistent, and therefore $w(B, \mathcal{N}_B) \leq \gamma^j_{hi}(D) + \sum_{1 \leq l \leq k} w(L_l, \mathcal{N}_l)$.

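By strong weight consistency of $Dr$, we have

$w(L, \mathcal{N}) \geq \gamma^i_{lo}(C) + \sum_{1 \leq l \leq k} w(L_l, \mathcal{N}_l) + \sum_{k+1 \leq l \leq n} w(L_l, \mathcal{N}_l)$

$\geq \gamma^i_{lo}(C) + w(B, \mathcal{N}_B) - \gamma^j_{hi}(D) + \sum_{k+1 \leq l \leq n} w(L_l, \mathcal{N}_l)$ ... (*)

$> w(B, \mathcal{N}_B)$ (by condition (5) of folding)

Hence there exists a strongly weight consistent derivation $B \leadsto_{P_{i+1}} \mathcal{N}_B$. We
now construct $Dr'$ with $L \text{ :- } B, L_{k+1}, \ldots, L_n$ at the root and then appending
below the strongly weight consistent derivations $B \leadsto_{P_{i+1}} \mathcal{N}_B$,
$L_{k+1} \leadsto_{P_{i+1}} \mathcal{N}_{k+1}, \ldots, L_n \leadsto_{P_{i+1}} \mathcal{N}_n$. To show that $Dr'$ is strongly weight consistent, note
that $\gamma^{i+1}_{lo}(C') \leq \gamma^i_{lo}(C) - \gamma^j_{hi}(D)$, since $C$ and $D$ are folded and folder clauses.
Combining this with (*),

$w(L, \mathcal{N}) \geq \gamma^{i+1}_{lo}(C') + w(B, \mathcal{N}_B) + \sum_{k+1 \leq l \leq n} w(L_l, \mathcal{N}_l)$

This completes the proof. □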

Thus we have shown that all positive ground derivations are preserved at
every step of our transformation. Now we show how our notion of positive ground 
derivations directly corresponds to the notion of semantic kernel. Intuitively, this 
connection is clear, since a clause in the semantic kernel of program P is derived 
by repeatedly resolving the positive body literals of a ground instance of a clause 
in P until the body contains only negative literals. Formally, we prove that : 

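Theorem 3 Let $P$ be a normal logic program and $A, B_1, \ldots, B_n$ ($n \geq 0$) be
ground atoms in the Herbrand base of $P$. Let $\mathcal{N}$ be the sequence $\neg B_1, \ldots, \neg B_n$.
Then, $A$ derives $\mathcal{N}$ in $P$ iff $(A \text{ :- } \mathcal{N}) \in SK(P)$.

Proof Sketch: We prove $A \leadsto_P \mathcal{N} \Rightarrow (A \text{ :- } \mathcal{N}) \in SK(P)$ by strong induction on
the weight (i.e. the number of internal nodes, refer to Definition 5) in the derivation
$A \leadsto_P \mathcal{N}$. The proof for $(A \text{ :- } \mathcal{N}) \in SK(P) \Rightarrow A \leadsto_P \mathcal{N}$ follows by fixed-point
induction. □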
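
The correspondence in Theorem 3 can be illustrated on a small finite ground program. The following is only a sketch under simplifying assumptions that are not part of the paper: clauses are already ground, bodies of kernel clauses are treated as sets rather than sequences of negative literals, and the function name `semantic_kernel` and the example program are hypothetical. It repeatedly resolves positive body literals against kernel clauses found so far until nothing new appears.

```python
from itertools import product


def semantic_kernel(program):
    """Compute kernel clauses of a finite ground normal program.

    program: iterable of (head, positive_body, negative_body) triples,
    where both bodies are tuples of atom names.
    Returns a set of (head, frozenset_of_negated_atoms) pairs.
    """
    program = list(program)
    kernel = set()
    changed = True
    while changed:
        changed = False
        for head, pos, neg in program:
            # For each positive body atom, collect the bodies of the kernel
            # clauses found so far that can be used to resolve it away.
            options = [[n for (h, n) in kernel if h == atom] for atom in pos]
            for combo in product(*options):
                body = frozenset(neg).union(*combo)
                if (head, body) not in kernel:
                    kernel.add((head, body))
                    changed = True
    return kernel


# Hypothetical example program:
#   p :- q, not r.    q :- not s.    q :- t.    t.
example = [
    ("p", ("q",), ("r",)),
    ("q", (), ("s",)),
    ("q", ("t",), ()),
    ("t", (), ()),
]
kernel = semantic_kernel(example)
```

Each pair `(head, body)` in the result plays the role of a pair $(A, \mathcal{N})$ such that $A$ derives $\mathcal{N}$; the loop terminates because a finite ground program admits only finitely many such pairs.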

We can now prove that the semantic kernel is preserved across any unfold/fold 
transformation sequence. 

Corollary 3 (Preservation of Semantic Kernel) Suppose $P_0, \ldots, P_n$ is a
sequence of normal logic programs where $P_{i+1}$ is obtained from $P_i$ by unfolding
(Rule 1) or folding (Rule 2). Then $\forall\, 0 \leq i \leq n\ SK(P_i) = SK(P_0)$.

Proof: We prove that $SK(P_0) = SK(P_i)$ for any arbitrary $i$. By Theorem 2
we know that $A \leadsto_{P_0} \mathcal{N} \Leftrightarrow A \leadsto_{P_i} \mathcal{N}$ for any ground atom $A$ and sequence of
ground negative literals $\mathcal{N}$. Then, using Theorem 3 we get $(A \text{ :- } \mathcal{N}) \in SK(P_0) \Leftrightarrow$
$(A \text{ :- } \mathcal{N}) \in SK(P_i)$. Thus, $SK(P_0) = SK(P_i)$. □

Following Theorem 1 and Corollary 3 we have: 

Theorem 4 (Correctness of Unfolding/Folding) Let $P_0, \ldots, P_n$ be a sequence
of normal logic programs where $P_{i+1}$ is obtained from $P_i$ by an application
of unfolding (Rule 1) or folding (Rule 2). Then, for all $0 \leq i \leq n$ we have

(1) If $P_0$ is a definite logic program, then $P_0$ and $P_i$ have the same least Herbrand
model.

(2) If $P_0$ is a stratified program, then $P_0$ and $P_i$ have the same perfect model
semantics.

(3) $P_0$ and $P_i$ have the same well-founded model.

(4) $P_0$ and $P_i$ have the same stable model(s).

(5) $P_0$ and $P_i$ have the same set of partial stable models.

(6) $P_0$ and $P_i$ have the same stable theory semantics.

4 Discussion 

In this paper we have presented an unfold/fold transformation system which, to
the best of our knowledge, is the first to permit folding in the presence of recursion,
disjunction, as well as negation. Such a system is particularly important
for verifying temporal properties of parameterized concurrent systems (such as
an n-bit shift register for any n) using logic program evaluation and deduction
[5, 17].

The transformation system presented in this paper can be extended to incorporate
a goal replacement rule which allows the replacement of a conjunction of
atoms in the body of a clause with another semantically equivalent conjunction
of atoms, provided certain conditions are satisfied (which ensure preservation of
weight consistency). In the future, it would be interesting to study how we can
perform multiple replacements simultaneously without compromising correctness
(as discussed in [3]).

Apart from the transformation system, the details of the underlying correctness
proof reveal certain interesting aspects generic to such transformation
systems. First of all, our proof exploits a degree of modularity that is inherent
in the unfold/fold transformations for logic programs. Consider a modular
decomposition of a definite logic program where each predicate is fully defined
in a single module. Each module has a set of “local” predicates defined in the
current module and a set of “external” predicates used (and not defined) in the
current module. It is easy to see that Lemmas 1 and 2 and Theorem 2 can be
modified to show that unfold/fold transformations preserve the set of local ground
derivations of a program. We say that $A \leadsto B_1, B_2, \ldots, B_n$ is a local ground
derivation (analogous to a positive ground derivation), if each $B_i$ contains an
external predicate, and there is a proof tree rooted at $A$ whose leaves are labeled
with $B_1, \ldots, B_n$ (apart from true). Consequently, transformations of a normal
logic program $P$ can be simulated by an equivalent positive program module $Q$
obtained by replacing negative literals in $P$ with new positive external literals.
The newly introduced literals can be appropriately defined in a separate module.
Thus any transformation system for definite logic programs that preserves local
ground derivations also preserves the semantic kernels of normal logic programs.

Secondly, we showed that positive ground derivations form the operational
counterpart to semantic kernels. This result, which makes explicit an idea in the
proof of Aravindan and Dung [1], enables the correctness proof to be completed
by connecting the other two steps: an operational first step, where the measure
consistency technique is used to show the preservation of positive ground
derivations, and the final model-theoretic step that applies the results of Dung
and Kanchanasut [6] relating semantic kernels to various semantics for normal
logic programs.

The semantic kernel is a fundamental concept in the study of model-based
semantics. By their very nature, however, semantic kernels cannot be used in proving
operational equivalences such as finite failure and computed answer sets. The
important task then is to formulate a suitable operational notion that plays the role
of the semantic kernel in the correctness proofs with respect to these equivalences.

Abstract. We show that there is a class C of finite structures and a
PTIME quantifier $Q$ such that

(1) IFP($Q$) is bounded on C but $L^\omega_{\infty\omega}(Q) \neq$ PFP($Q$) $\neq$ IFP($Q$) =
FO($Q$) over C.

(2) For all $k \geq 2$, IFP$^k$($Q$) is bounded but not uniformly bounded over
C. (IFP$^k$($Q$) denotes the $k$-variable fragment of IFP($Q$).)

(3) For all $k \geq 2$, IFP$^k$($Q$) is not uniformly bounded over C but
IFP$^k$($Q$) = $L^k(Q)$ over C.



1 Introduction 

First order logic (FO) has limited expressive power on finite structures. There- 
fore, fixed point logics, such as LFP and PFP have been studied widely in finite 
model theory. On ordered structures LFP expresses exactly PTIME queries but 
on unordered structures its expressive power is limited and is hard to character- 
ize. 

In the study of the expressive power of LFP and other fixed point logics, an
important role is played by an infinitary logic with finite variables, denoted
$L^\omega_{\infty\omega}$. LFP can be viewed as a fragment of $L^\omega_{\infty\omega}$. To understand the relative
expressive power of FO, LFP and $L^\omega_{\infty\omega}$ on subclasses of finite structures, McColm
made two conjectures in [10]. These conjectures are centered around the notion
of boundedness for fixed point formulae over a class of finite structures. Let C be
any class of finite structures. McColm’s first conjecture states that LFP collapses
to FO on C iff LFP is bounded on C. McColm’s second conjecture states that
$L^\omega_{\infty\omega}$ collapses to FO on C iff LFP is bounded on C.

The first conjecture was refuted in [5] and the second conjecture was confirmed
in [8]. Further, in [9] ramified versions of these conjectures, that is, analogous
questions for fixed variable fragments of these logics, have been studied.
For this a suitable definition of the $k$-variable fragment of LFP, LFP$^k$, is formulated.
The notion of boundedness for LFP$^k$ over a class of structures can be defined
in the usual manner, by requiring bounded induction for each LFP$^k$ formula
(or system of formulae). However, for LFP$^k$ a stronger notion of boundedness
called uniform boundedness [9] can also be defined. Uniform boundedness
requires that there is a constant which bounds the number of inductive stages
over C for all systems of formulae in LFP$^k$. It is shown in [9] that on any class
C of finite structures, $L^k_{\infty\omega}$ collapses to $L^k$ on C iff LFP$^k$ is uniformly bounded
on C.

After some successful development of model theory of finite variable logics 
in recent years, for example as witnessed by elegant results mentioned above, 
it is natural to examine if these results also hold for more general logics or if 
the techniques developed there can also be applied to richer contexts. Recently, 
extensions of first order, fixed point and infinitary logics with generalized quan- 
tifiers have been studied extensively [7,2,3]. In this paper, we study questions 
analogous to McColm’s conjectures and ramified versions of these for logics with 
generalized quantifiers. 

We show that McColm’s second conjecture cannot be extended to logics
with arbitrary generalized quantifiers, whereas a ramified version of the second
conjecture does hold for an arbitrary finite set of generalized quantifiers. Our main
result is a construction which disproves the extension of McColm’s second conjecture
to logics with generalized quantifiers. We construct a generalized quantifier $Q$
and a class C of finite structures such that IFP($Q$) is bounded over C but the
number of L‘^{Q) types realized is unbounded over C. This in turn implies that 

$L^\omega_{\infty\omega}(Q) \neq$ FO($Q$) over C. Further, we can also construct a PTIME computable
generalized quantifier with the above properties.

A byproduct of this construction is a different proof of a result in [3], that
there is a PTIME computable generalized quantifier $Q$ such that $L^k(Q)$ types
cannot be ordered in IFP($Q$). The construction of [3] does not imply the result
mentioned above, as the class C over which IFP($P$) $\neq$ PFP($P$) (and hence
FO($P$) $\neq L^\omega_{\infty\omega}(P)$) is established in [3] for a quantifier $P$ admits an unbounded
induction, because a linear order is explicitly defined on part of each structure
in C. The linear order in that construction is used in an essential way.

For the same class of structures C and PTIME quantifier $Q$ used to disprove
the extension of McColm’s second conjecture, we show that IFP$^k$($Q$) is bounded
but not uniformly bounded on C. To our knowledge, this is the first example
in which the two notions of boundedness for (extensions of) fixed point logic
are provably distinct. It also follows easily that IFP$^k$($Q$) = $L^k(Q)$ on this class,
which disproves a version of ramification of the first conjecture with generalized
quantifiers. It is interesting to note that these results hold for an extension of
IFP which is still in PTIME. It should be noted that both these questions,
boundedness vs. uniform boundedness and the ramified version of McColm’s first
conjecture, are open when no generalized quantifiers are allowed.

Our construction of the generalized quantifier is by diagonalization. Generally,
this process for constructing quantifiers can be thought of as similar to
oracle constructions in complexity theory. However, the diagonalization process
in constructing quantifiers is more involved than the usual oracle constructions
in complexity theory. One reason for this is that in constructing generalized
quantifiers one does not operate just on the level of strings or a specific
representation of a structure but on a more abstract level of the structure itself. If we
include (exclude) a structure in the class defining a generalized quantifier then
we also have to include (exclude) all structures isomorphic to it in the class.

In our construction we need to define an equivalence relation coarser than
isomorphism on structures such that whenever we include (exclude) a structure
in the class defining the generalized quantifier then we also have to include (exclude)
all structures related to it in the class. To go through the construction we look
into the structure of equivalence classes of this relation and find an invariant of
an equivalence class. The technical machinery developed to achieve this may be
useful to construct generalized quantifiers in other contexts also.



2 Preliminaries 

We assume the reader to be familiar with basic notions of finite model theory. 
In this section, we will present only some concepts and notations relevant to us. 
For a detailed introduction see [4,1].

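We consider finite structures over (finite) relational vocabularies. If $\mathfrak{A}$ denotes
a structure then $|\mathfrak{A}|$ will denote its domain and $\|\mathfrak{A}\|$ will denote the cardinality of
the domain.

We use $\oplus$ to denote the disjoint union of two structures. That is, if $\mathfrak{A}, \mathfrak{B}$ are $\sigma$
structures, then $\mathfrak{A} \oplus \mathfrak{B}$ is a $\sigma$ structure with domain $\{0\} \times |\mathfrak{A}| \cup \{1\} \times |\mathfrak{B}|$. For any
relation $R$ of arity $i$ in $\sigma$ and any $d_1, \ldots, d_i \in |\mathfrak{A} \oplus \mathfrak{B}|$, $\mathfrak{A} \oplus \mathfrak{B} \models R(d_1, \ldots, d_i)$
iff either (i) $d_1 = 0a_1, \ldots, d_i = 0a_i$ and $\mathfrak{A} \models R(a_1, \ldots, a_i)$ or (ii) $d_1 = 1b_1, \ldots,$
$d_i = 1b_i$ and $\mathfrak{B} \models R(b_1, \ldots, b_i)$. The operation $\oplus$ extends to more than two
structures in an obvious way.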
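
As an illustration of the disjoint union just defined, here is a minimal sketch. The representation is an assumption made for this example only: a structure is a pair of a domain set and a dictionary mapping relation names to sets of tuples, and `disjoint_union` is a hypothetical helper name.

```python
def disjoint_union(a, b):
    """Disjoint union of two relational structures over the same vocabulary.

    Each structure is (domain, relations), where relations maps a relation
    name to a set of tuples over the domain. Elements of the first structure
    are tagged with 0 and elements of the second with 1.
    """
    dom_a, rels_a = a
    dom_b, rels_b = b
    domain = {(0, x) for x in dom_a} | {(1, y) for y in dom_b}
    relations = {}
    for name in rels_a.keys() | rels_b.keys():
        tagged_a = {tuple((0, x) for x in t) for t in rels_a.get(name, set())}
        tagged_b = {tuple((1, y) for y in t) for t in rels_b.get(name, set())}
        # A tuple is in R of the union iff it comes entirely from one side.
        relations[name] = tagged_a | tagged_b
    return domain, relations


# Hypothetical example: an edge 0 -> 1 and a single loop on element 0.
edge = ({0, 1}, {"R": {(0, 1)}})
loop = ({0}, {"R": {(0, 0)}})
union_domain, union_relations = disjoint_union(edge, loop)
```

No tuple of the result mixes elements tagged 0 with elements tagged 1, which matches clauses (i) and (ii) of the definition.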

Generalized Quantifiers: Let $K$ be any class of structures over the vocabulary
$\sigma = \{R_1, \ldots, R_m\}$, where $R_i$ has arity $n_i$. We associate with $K$ the generalized
(or Lindström) quantifier $Q_K$.

For a logic $L$, define the extension $L(Q_K)$ as the set of formulas arising by
adding to the formula formation rules of $L$ the following new rule for formula
formation: if $\varphi_1, \ldots, \varphi_m$ are formulas of $L(Q_K)$ and $x_1, \ldots, x_m$ are tuples of
variables (with the length of $x_i$ being $n_i$), then $Q_K x_1, \ldots, x_m(\varphi_1, \ldots, \varphi_m)$ is a
formula of $L(Q_K)$. The type of this quantifier is the tuple $(n_1, \ldots, n_m)$ and its
arity is $\max(n_i)$.

The semantics of the quantifier is given by:

$\mathfrak{A} \models Q_K x_1, \ldots, x_m(\varphi_1(x_1, y), \ldots, \varphi_m(x_m, y))[a]$ if and only if

$(|\mathfrak{A}|, \varphi_1^{\mathfrak{A}}[a], \ldots, \varphi_m^{\mathfrak{A}}[a]) \in K$, where $\varphi_i^{\mathfrak{A}}[a] = \{b \in |\mathfrak{A}|^{n_i} \mid \mathfrak{A} \models \varphi_i[b, a]\}$.
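
The semantic clause above can be sketched operationally as follows. Everything named here is an assumption for illustration: formulas are modelled as Python predicates taking a tuple `b` and a parameter tuple, the class $K$ is modelled as a Boolean function `in_class` on the interpreted structure (assumed to be isomorphism invariant), and `even_class` is a made-up example class.

```python
from itertools import product


def holds_quantifier(in_class, domain, formulas, arities, params=()):
    """Evaluate a generalized quantifier on a finite structure.

    in_class: function (domain, relations) -> bool playing the role of K.
    formulas: predicates phi(b, params) -> bool, one per relation of K.
    arities:  the type (n_1, ..., n_m) of the quantifier.
    """
    domain = list(domain)
    # Build the relation defined by each formula with the parameters fixed.
    relations = [
        {b for b in product(domain, repeat=n) if phi(b, params)}
        for phi, n in zip(formulas, arities)
    ]
    return in_class(set(domain), relations)


def even_class(domain, relations):
    """Hypothetical class K of type (1): the unary relation has even size."""
    return len(relations[0]) % 2 == 0


edges = {(0, 1), (1, 2), (2, 0), (0, 3)}


def has_successor(b, params):
    return any((b[0], y) in edges for y in range(4))


def always_true(b, params):
    return True


odd_case = holds_quantifier(even_class, range(4), [has_successor], [1])
even_case = holds_quantifier(even_class, range(4), [always_true], [1])
```

Here three of the four elements have a successor, so the first query is false, while the full domain has even size, so the second is true.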

Fixed point logics, infinitary logic and types: The definitions of LFP (least
fixed point logic) and IFP (inflationary fixed point logic) can be found in [4,1].
As a technical remark, we just mention that we do not allow any parameter
variables in the definition of fixed point; this is convenient for our purposes and
is not a real restriction, see [4, Lemma 7.1.10 (b), p. 171].

LFP and IFP are equivalent in expressive power (over finite structures)
(see [6]). However, IFP is more robust in that it can be naturally extended with
arbitrary generalized quantifiers, so we will mostly use IFP in this paper.

We denote the $k$-variable fragment of first order logic by $L^k$. For the definition
of infinitary logic, $L^k$-types and $L^k(Q)$-types, see [1].

McColm’s conjectures: A detailed exposition of McColm’s conjectures may 
be found in [1,8,9]. These conjectures are centered around the following definition 
of boundedness. 

Definition 1 Let $\varphi(\bar{x}, S)$ be a first order formula positive in $S$ over
vocabulary $\sigma \cup \{S\}$, where $S$ is an $n$-ary relation symbol not in $\sigma$. $\varphi$ is bounded
over a class C of structures if there is an $m$ such that on every structure $\mathfrak{A} \in$ C,
the LFP iteration of $\varphi$ converges in $\leq m$ steps. LFP is bounded over C if every first
order formula $\varphi$ as above is bounded over C.
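
To make the notion of boundedness concrete, the following sketch counts the stages of a fixed point iteration. The operator is given directly as a Python function on sets rather than as a first order formula, and the transitive closure operator on directed paths is a standard example chosen here for illustration; the names `lfp_stages`, `tc_step` and `path` are hypothetical.

```python
def lfp_stages(step):
    """Iterate a monotone operator from the empty set.

    Returns (fixed_point, number_of_stages_until_convergence).
    """
    current = frozenset()
    stages = 0
    while True:
        nxt = frozenset(step(current))
        if nxt == current:
            return current, stages
        current = nxt
        stages += 1


def tc_step(edges):
    """Operator for transitive closure: S -> E union (S composed with E)."""
    edges = set(edges)

    def step(s):
        return edges | {(x, z) for (x, y) in s for (y2, z) in edges if y == y2}

    return step


def path(n):
    """Edges of the directed path 0 -> 1 -> ... -> n."""
    return {(i, i + 1) for i in range(n)}


_, stages_short = lfp_stages(tc_step(path(3)))
_, stages_long = lfp_stages(tc_step(path(6)))
```

On the class of all directed paths the number of stages grows with the length of the path, so no single $m$ works and this operator is not bounded over that class.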

Note that boundedness for IFP can be defined in the same way as has been
defined for LFP, and LFP is bounded on a class of finite structures iff IFP
is bounded on this class. In [10], McColm made the following conjecture relating
boundedness of LFP over a class C to the relative expressive power of fixed
point and infinitary logic over C.

Conjecture Let C be a class of structures. The following are equivalent.

(i) LFP is bounded over C.

(ii) LFP = FO over C.

(iii) $L^\omega_{\infty\omega}$ = FO over C.

The equivalence between (i) and (ii) above is termed McColm’s first
conjecture in [8], and was refuted in [5]. The equivalence between (i) and (iii) is
termed McColm’s second conjecture in [8], where it was shown to hold.

Bounded variable fragment of fixed point logics: In [9] the $k$-variable
fragment of LFP, LFP$^k$, is defined as a system of positive $L^k$ formulae. The
semantics of this system is obtained by taking simultaneous fixed points of
formulae of the system. Essentially the same definition works for IFP$^k$($Q$). We
also define fragments of IFP$^k$($Q$) based on the number of components in the
system.

Definition 2 IFP$^{k,i}$($Q$) denotes the fragment of IFP$^k$($Q$) obtained by taking
systems of at most $i$ formulae of $L^k(Q)$.

Given a system of $L^k(Q)$ formulae, its closure function and change function
are defined in exactly the same way as in [9]. We recall the definition of uniform
boundedness of IFP$^k$($Q$) over a class C of structures as in [9].

Definition 3 Let $k$ be a positive integer, C a class of $\sigma$ structures and $Q$ a
set of generalized quantifiers. IFP$^k$($Q$) is uniformly bounded on C if there is a
positive integer $m_0$ such that for all $l > 0$, for every system $S = (\varphi_1, \ldots, \varphi_l)$ of
$L^k(Q)$ formulae and for every $\sigma$-structure $\mathfrak{A} \in$ C we have that $ch(S)(\mathfrak{A}) \leq m_0$.

The following is the ramified version of McColm’s second conjecture, proved in
[9].

Theorem 1 [9] Let $k$ be a positive integer. The following are equivalent over a
class C of finite structures.

1. LFP$^k$ is uniformly bounded over C.

2. $L^k_{\infty\omega} = L^k$ over C.

3 Constructing a Quantifier 

In this section we construct a class C of rigid structures over the vocabulary
$\Sigma = \{R\}$, consisting of one binary relation, and a quantifier $P$ such that IFP($P$)
is bounded on C but $L^\omega_{\infty\omega}(P) \neq$ FO($P$) over C. The quantifier $P$ is over the
signature $\sigma$ consisting of one binary and two unary relations.

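We construct a sequence of rigid connected graphs $\mathfrak{M}_1, \mathfrak{M}_2, \ldots, \mathfrak{M}_i, \ldots$. Let
$\mathfrak{G}_i = \mathfrak{M}_1 \oplus \ldots \oplus \mathfrak{M}_i$. The class C is defined as $\{\mathfrak{G}_n \mid n \geq 1\}$. The quantifier $P$
is obtained by diagonalization over a fixed enumeration $(\varphi_i)_{i \in \omega}$ of all IFP($P$)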
formulae. Each &i has at least i many, LP‘{F) types and the number of iterations 
for each IFF occurrence in fii is bounded by (d.||0^||^ 0 d), for a fixed constant 
d, over the entire class C. For brevity, we use $\mathfrak{M}_{i,j}$ to denote $\mathfrak{M}_i \oplus \ldots \oplus \mathfrak{M}_j$.

The following lemma gives some information about structures in the class C. It can
be easily proved using EF games.

Lemma 1 Let $k$ be a positive integer.
Let $C_k = \{\mathfrak{A}_1 \oplus \ldots \oplus \mathfrak{A}_n \mid n \geq 1$, each $\mathfrak{A}_i$ satisfies $k$-extension axioms$\}$. Then
the number of $L^k$ types realized in each structure $\mathfrak{A} \in C_k$ is bounded, that is, it
is only a function of $k$ and is independent of the structure.

Proof: Easy. □

Lemma 2 Let k, rn/n be positive integers. Let 2t = ®o 0 0 ... 0 a 

structure such that ®i, . . . , 03^ satisfy k-extension axioms. Let ai G |2t|^ i < m. 
Suppose ai is extended to ( 01 , 02 ); where 02 G \%i 0 ... 0 02 has 

at least one element not in Oi. If ( 01 , 02 ) has type r in % then there are at least 
distinct extensions of a± which have type r in 2t. 

Proof: This can be easily proved using the definition of extension axioms and 
observing the structure of A:-types for each G in the proof of Lemma 1. 
□ 

Note that the restriction that 02 2 is needed otherwise there is a unique 
extension of oi to type r. 

A quantifier $P$ can be seen as a collection $(P^n)_{n \in \omega}$, where $P^n$ is the set of
structures in $P$ whose domain size is $n$. $P^n$ can be viewed as an isomorphism
closed class of structures which have a domain of size $n$.

For our construction below, we need to give a definition of a partial quantifier
on structures of domain size $n$. The following definition is along the expected
lines. We consider structures equal up to isomorphism.




On L^{Q) Types and Boundedness of IFP{Q) on Finite Structures 339 



Definition 4 A total quantifier $P^n$ is a mapping from
$\{\mathfrak{A} \mid \mathfrak{A}$ a $\sigma$ structure, $\|\mathfrak{A}\| = n\}$ to $\{true, false\}$. A partial quantifier $P^n$ is
a mapping from $\{\mathfrak{A} \mid \mathfrak{A}$ a $\sigma$ structure, $\|\mathfrak{A}\| = n\}$ to $\{true, false, *\}$. The $*$ can
be thought of as the ‘undefined’ element. The domain of a partial quantifier $P^n$ is the
pre-image of $\{true, false\}$ under $P^n$. A partial quantifier $P_2^n$ extends a partial
quantifier $P_1^n$ if $P_2^n$ restricted to the domain of $P_1^n$ is the same as $P_1^n$.
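
A partial quantifier in the sense of Definition 4 can be modelled, under assumptions made only for this sketch, as a dictionary from hashable structure identifiers (standing for structures up to isomorphism) to `True`, `False` or `None`, with `None` or a missing key playing the role of the undefined element. The names `domain_of` and `extends` are hypothetical.

```python
UNDEFINED = None  # stands for the 'undefined' value of a partial quantifier


def domain_of(quantifier):
    """Structures on which the partial quantifier is true or false."""
    return {s for s, v in quantifier.items() if v is not UNDEFINED}


def extends(q2, q1):
    """True iff q2 agrees with q1 on every structure in the domain of q1."""
    return all(q2.get(s, UNDEFINED) == q1[s] for s in domain_of(q1))


# Hypothetical structure identifiers "s1", "s2", "s3".
q1 = {"s1": True, "s2": UNDEFINED}
q2 = {"s1": True, "s2": False, "s3": True}
q3 = {"s1": False, "s2": False}
```

In this example `q2` extends `q1`, since it only decides structures that `q1` left undefined, while `q3` does not, because it changes the value on `s1`.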

We will often use the following easy observation. Let $\varphi(x)$ be an $L(P_1)$ formula
for a logic $L$ and generalized quantifier $P_1$. If evaluation of $\varphi$ on structure $\mathfrak{A}$ does
not entail evaluation of a subformula of $\varphi$ beginning with $P_1$ on a $\sigma$ structure
which is outside the domain of $P_1$, then for all quantifiers $P_2$ which extend $P_1$,
$\varphi(x)$ and $\varphi(x)[P_2/P_1]$ are equivalent over $\mathfrak{A}$.

Consider a structure 2t = ®i 0 ® 2 - We wish to define subsets of |2t|^ whose 
projection over ®2 is closed under the relation over ® 2 - 

Definition 5 Let = ®i 0 ®2 cl structure. Let k he a positive integer and 
[k] = ,k}. Let tt C [A:] with |7t| = r. Let E C he an equivalence 

class of the relation on and let a G |®i|^. We call {E,a,7v) a k- 

triple over (®i,® 2 )- For such a triple {E,a,7v) we define a set S(^E,a., 7 r) = 
{b G |2t|^ I projection of b on tv is a and projection on [k] — n is in E}. Note 
that ifr = 0, then we just have {E,iv) as triple and S(^ee) — On the other 

hand if r = k then we have {a, tv) as triple and C |®i|^. 

A set T is A:-closed over ®2 if it is a union of sets of the form S^, where 
e ranges over some set of k -triples over (®i,® 2 )- 



Let (pi{xi, . . . , Xk),(p 2 {^i, • • • , • • • , ^k) be formulae (On a structure 

2t these can also be thought of subsets of |2t|^) and let be permutations 

on k}. We can associate with this data a formula <j> as follows. 

(j){x\ , . . . ,X]f) ^a(2) ? ^ (3 (1) 7 (1) (0l(^o;(l)? • • • ? ^ a ( k ) ) 7 

, 02 (^/5(1) 7 • • • 7 ^ j3{k) ) 7 03 (^"y(l) 7 • • • 7 ^'y(k) )) 

For a 0 as above and (ai, . . . , a^) G |2t|^, 0(ai, . . . , a^) naturally gives rise to 
a structure over vocabulary a of quantifier P. This structure is (|2t|, P, U, V), 

where R = {(61,62) | 2t ^ 0i(6i , 62, ac,(3) , • • • 

P = {fe I h 02(6,a^(2),---,a^(fc))}7 V = {b\Ql\= 0a(6, a^(2), • • • , a^(fc))} 

We will call (|2t|,P, U, V) the structure associated with 0(ai, . . . ,a^). Note 
that 2t 1= 0(ai, . . . , a^) iff P is true on (|2t|, P, U, V). 

Definition 6 Let 21 = ® 1 CD 2^2 he a Lj structure. IFc call two structures 2fi and 
212 over a to he (®i, ®2) related if there are 0i, 02, 03 defining k-closed sets over 
®2; permutations a,f3,j and tuples 61,62 G for some k-triple {E,a,7v) 

over (®i,®2) such that 2ti corresponds to 0(6i) and 2I2 corresponds to 0(62); 
where 4>{x) is as defined above. 

R is easily seen that transitive closure of the above relation is an equivalence 
relation. We denote it by ^o)ill drop the subscript when it is fixed 

by the context). By EQ{%i), we denote the equivalence class of this relation 
containing the structure 2ti. 

We define the quantifier on = SDtij, in j steps. At step i < we
make sure that IF F{F) formula <j>i^ can ‘access’ only sets which are closed over 
j). This will ensure that all inductions in <j>i are bounded in terms 

of \ 0 i\ only. 

The idea of EQ classes is introduced because if we start with $k$-closed sets
$\varphi_1, \varphi_2, \varphi_3$ and wish that $\varphi$ constructed from the generalized quantifier $P$ using these
sets also gives rise to $k$-closed sets, then $P$ should be assigned the same value on
all elements of an EQ class.

In order to extend the quantifier further, we need some information about
the elements in the EQ class of a $\sigma$ structure to know which structures have
been included in the domain of the quantifier. Also, we require many unrelated
structures with respect to the relation EQ, so that including a structure in the
domain of the quantifier still leaves enough structures outside the domain of the
quantifier on which the value of the quantifier can be decided at later stages.
We achieve this by requiring our structure $\mathfrak{A} = \mathfrak{B}_1 \oplus \mathfrak{B}_2$ to satisfy the following
property.

Definition 7 Let 0 ®2 he a strueture. We say that satisfies the 

k-extension property over %2, if for eaeh h\ G |2t|^ and 62 G |®2p; ^2 2 ^1? 
ifiFj < k, there are at least |®i|0l distinet c^s sueh that (61,62) (61, c) in 

$\mathfrak{A}$. In other words, if a $k$-type can be extended to another $k$-type by adding new
elements then it can be extended to that type in at least $|\mathfrak{B}_1| + 1$ distinct ways.

The following Lemma gives some information about the structures in EQ{S) 
in terms of the substructures they may possess. We first define the notion of an
isolated substructure and a unique substructure of a $\sigma$ structure which is suitable
for our purpose.

Definition 8 Let $\mathfrak{M} = (|\mathfrak{M}|, R, U, V)$ be a $\sigma$ structure. Let $S \subseteq |\mathfrak{M}|$; we call
$(S, R \cap (S \times S), U \cap S, V \cap S)$ an isolated substructure of $\mathfrak{M}$ if $|S| \geq 2$, there
are no $R$ edges between $S$ and $|\mathfrak{M}| - S$, and each element of $S$ is connected to
at least one other element of $S$. The size or cardinality of this substructure is
$|S|$.



Definition 9 Let = (| 2 t|, A!, L, L) he a a strueture. A substructure 

2ti = (|2ti|, Li, Li) of ^ is a unique substructure 0 / if for any isolated 
substructure of % isomorphic to 2 ti = Ri,U[,V(), we have U\ = U[ and 

Vi = V[. In other words, up to an isomorphism, (| 2 ti|, has a unique extension 
to an isolated substructure of 2t. 



Lemma 3 Let 2 t = 0 be a structure such that 51 satisfies the k extenszon 

property over If a a structure R,U,V) has a unique substructure S), 
\S}\ < |®i|; then all structures {{"Oil, RfUfV') in t/, V) have 

a unique substructure isomorphic to 

Proof: This nontrivial proof requires a careful analysis of substructures of two
structures related by relation. Details are given in the full version. 

□ 

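Definition 10 Let $\mathfrak{A} = \mathfrak{B}_1 \oplus \mathfrak{B}_2$ be a structure and $P$ a partial quantifier on $\mathfrak{A}$.
We say that $P$ is consistent over $(\mathfrak{B}_1, \mathfrak{B}_2)$ if for every EQ class $\tau$ over $(\mathfrak{B}_1, \mathfrak{B}_2)$
and every $\mathfrak{C}_1, \mathfrak{C}_2 \in \tau$ in the domain of $P$, either $P$ is true on both $\mathfrak{C}_1, \mathfrak{C}_2$ or false
on both $\mathfrak{C}_1, \mathfrak{C}_2$.

A consistent partial quantifier $P$ is complete over $(\mathfrak{B}_1, \mathfrak{B}_2)$ if for each EQ
class $\tau$ over $(\mathfrak{B}_1, \mathfrak{B}_2)$, $P$ either includes the entire class $\tau$ in its domain or it
includes no structure from $\tau$ in its domain.

Given a $P$ consistent over $(\mathfrak{B}_1, \mathfrak{B}_2)$, its least $(\mathfrak{B}_1, \mathfrak{B}_2)$ complete extension
$P'$ is an extension of $P$ with the smallest domain size which is (consistent and)
complete over $(\mathfrak{B}_1, \mathfrak{B}_2)$. Note that it always exists.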
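
The notions of consistency and least complete extension can be sketched as follows. The representation is an assumption for illustration only: the EQ classes are given explicitly as a list of disjoint sets of structure identifiers, a partial quantifier is a dictionary whose keys are exactly the structures in its domain, and the function names are hypothetical.

```python
def is_consistent(quantifier, classes):
    """True iff the quantifier takes at most one value on each class."""
    return all(
        len({quantifier[s] for s in cls if s in quantifier}) <= 1
        for cls in classes
    )


def least_complete_extension(quantifier, classes):
    """Extend a consistent partial quantifier to whole equivalence classes.

    Only classes that already meet the domain are filled in, so the
    resulting domain is as small as possible.
    """
    if not is_consistent(quantifier, classes):
        raise ValueError("partial quantifier is not consistent")
    extension = dict(quantifier)
    for cls in classes:
        values = {quantifier[s] for s in cls if s in quantifier}
        if values:
            value = next(iter(values))
            for s in cls:
                extension[s] = value
    return extension


# Hypothetical equivalence classes and partial quantifier.
classes = [{"a", "b"}, {"c", "d"}, {"e"}]
partial = {"a": True, "c": False}
completed = least_complete_extension(partial, classes)
```

The class `{"e"}` is left outside the domain, which is the kind of room the construction needs in order to decide the quantifier on further structures at later stages.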

Definition 11 Let $\mathfrak{A} = \mathfrak{B}_1 \oplus \mathfrak{B}_2$ be a structure and $P$ a partial quantifier on $\mathfrak{A}$.
We say that $L^k_{\infty\omega}(P)$ is $k$-closed over $\mathfrak{B}_2$ if each $L^k_{\infty\omega}(P)$ formula $\varphi(x)$,
whose evaluation on $\mathfrak{A}$ does not require a query to $P$ outside its domain, defines
a subset of $|\mathfrak{A}|^k$ which is $k$-closed over $\mathfrak{B}_2$. (Note that even if $\varphi$ has fewer than $k$
free variables, it can be thought of as defining a subset of $|\mathfrak{A}|^k$ by adding dummy
free variables.)

Lemma 4 Let $\mathfrak{A} = \mathfrak{B}_1 \oplus \mathfrak{B}_2$ and let $Q$ be a partial quantifier on $\mathfrak{A}$ which is
consistent over $(\mathfrak{B}_1, \mathfrak{B}_2)$. Then $L^k_{\infty\omega}(Q)$ is $k$-closed over $\mathfrak{B}_2$, in the sense of
Definition 11.

Proof: This follows by an easy induction on $L^k_{\infty\omega}(Q)$ formulae. □

This Lemma confirms the intuition mentioned in the remarks after Definition 6.

Lemma 5 Let $\mathfrak{A} = \mathfrak{B}_1 \oplus \mathfrak{B}_2$ be a rigid structure and let $Q$ be a partial quantifier
on $\mathfrak{A}$ such that $L^k_{\infty\omega}(Q)$ is $k$-closed over $\mathfrak{B}_2$. Then for each IFP($Q$) formula
$\varphi$, with at most $k$ variables, whose evaluation on $\mathfrak{A}$ does not query $Q$ outside its
domain, there is a polynomial $p_\varphi$ such that in its evaluation on $\mathfrak{A}$, subformulae
beginning with $Q$ need to be evaluated at most $p_\varphi(|\mathfrak{B}_1|, t)$ times, where $t$ is the
number of $k$-types realized in $|\mathfrak{B}_2|$. Further, all IFP stages in $\varphi$ define $k$-closed
sets over $\mathfrak{B}_2$.

Proof: In full version. □

The point to note about the polynomial bound in the Lemma above is that
it is polynomial in $|\mathfrak{B}_1|$ and not in $|\mathfrak{A}|$.
Finally, we are ready for the actual construction. As mentioned in the beginning
of this section, we construct a class C of rigid structures, over the vocabulary
$\Sigma = \{R\}$, consisting of one binary relation, and a quantifier $P$ over $\sigma$.

The construction is by diagonalization over a fixed enumeration $(\varphi_i)_{i \in \omega}$
of all IFP($P$) formulae. We construct a sequence of rigid connected graphs
$\mathfrak{M}_1, \mathfrak{M}_2, \ldots$. Let $\mathfrak{G}_i = \mathfrak{M}_1 \oplus \ldots \oplus \mathfrak{M}_i$. The class C is defined as

{0n|^ > 1}. In stage i, we come up with Wli and define Each a G \0i\ 

has a distinct L‘^{P) type. The number of stages in each IFF iteration of <j>i is 
bounded by (d.||0^||^0d), for a fixed constant d which may depend on <f>i alone, 
over the entire class C . 

For any j, let raj > 3 be such that all . . . , <j)j have at most raj variables 
and let Pj{x,y) be 0 ... 0 , where Fjy^^ / < j, is as in Lemma 5. We 

will assume, without loss of generality, that all our bounding polynomials are 
monotone. Using Lemma 1, let tm be the upper bound on the number of ra types 
realized in structures of class 

We now describe stage i in the construction of P^ C. Let n be such that for 
all m > — ra — m? > Pi{ra^ Choose SDl^ a rigid, connected graph of size 

> n which satisfies rai 0 m^.||0^_i||-extension axioms. This ensures the desired 
m^-extension property by Lemma 2. Let the size of be n^. Recall that we 
use the abbreviation for 0 ... 0 dRj 

Note that, it follows by our definition of isolated substructures and the sizes 
of various that for every j < i, (SDlij, U fl |SDlij|,U fl |SDli,j|) is a unique 

substructure of (0^, U, V). 

Let <mi 7 • • • 7 be canonical orderings on |SDli | , . . . , |SDli | respectively (such 
orderings can be defined in the second level of the polynomial hierarchy). 

We will define sets Ti^o C C . . . C such that Ti^o = 0, Tij C 

and \Tij\ > \Tij-i \ 0 2, 1 < j < i — 1. The quantifier pW^^W will be defined as 
true on a structures (of domain size ||0z||), which have a unique substructure as 
(at least) one of the structures, (SDl^y , {a} U {b} U Tij-i)^ where a <^. b 

and 1 <j<i. is false on other a structures (of domain size ||0z||). 

We claim that, for any such any subset S C \W^lj\ will 

be L‘^{F) definable. More precisely, for each S as above there is L‘^{F) formula 
(ps{x) such that for all a G |0^|, 0^ \= (ps{ci) iff a G N. We prove this by induction 
on j. 

Base case j = 1 : Note that L‘^{F) formula 

0{x^y) = Pxy^y^x(R{x^y)^y = x^x = y) when interpreted over 0^ defines a 
linear order on |9?li| and does not relate any other elements of 0^. (The clever 
cross-wiring of variables in the above formula is from [2, Lemma 4.9, pp. 179]). 
Using this order we can define any S C |9}li| by an LF‘{F) formula. 

Induction step j = A: 0 1: Assume that the induction hypothesis holds for 
j = A:, Therefore Ti^k is L‘^{F) definable. Let ^(y) be the formula defining 

Note that L‘^{F) formula 

0{x,y) = Pxy,y,x{R{x,y),y = x V ^(y),x = y V 3y[y = x ^ ^(y)]), when 
interpreted over 0^ defines a linear order on |SDtj^+i| and does not relate any 
other elements of |0^|. Using this order we can define any S C by an 

L‘^{F) formula. This completes the induction. 

Using the procedure below we define Ti^o/Fi^i ^ . . . , with the properties 

mentioned above and define the quantifier pW^^\\ in the following i steps. 

$P_{i,0} = \emptyset$; $T_{i,0} = \emptyset$; (we drop the superscript in $P^{\|\mathfrak{G}_i\|}$ in the loop below)
For j = 1 to i do 

1. Let Pi^jf be the least such that, Pi^jf O Pi,j-i and Pi^jf is true on any a 
structure which has a unique substructure (SDlij, {a} U Tij-i^{b} U pj-i) 
for some a^b e |SDlj|, a b. 

Note that is complete over (SDlij, by Lemma 3. 

2. Consider the evaluation of (f)j{Pijf) on 0^, answering any query to 
outside its domain as false and extending the domain of P^^/ to reflect this. 
Let Pij" be the quantifier obtained after this evaluation. 

3. Since Pij^ is complete over (SDli,j , and we have only added structures 

with same truth values to the domain of Pij'^ quantifier Pij" is consistent 
over (SDlij, By Lemma 4, we get that every ^ {Pi,j") formula is 
A:-closed over (SDlij, All IFP iterations in <j>{Pi^jn) terminate in the 
number of stages bounded by the number of m^-triples over (SDlij, 

which is bounded by (2’^-^ .|SDli j 

4. By choice of Tlj and using Lemma 5, there is a T C \Tlj\^ \T\ > 2, such 
that Pijf' is not queried on any a structure which has a unique substructure 
{Tlij/Fij-i U T,Tij-i U T). We define Pj = Pj-i U T. Let Pij be Pijn . 

End For 

Finally, PH^^H is obtained by extending so that on all inputs outside 

the domain of pW^iW is set to false. We define P as Uzgu; • 

To see that 1FP{P) is bounded on C, consider <j>j in the enumeration of 
IFP(P) formulae. As shown above, for all i, all I FP(Pijf') iterations in (j>j on 
0i are bounded by (2^P\Tlij\'^Ftmj)- P extends Pij"^ so by an observation 
before, IFP{Pij") and IFP{P) operate identically on 0^. Therefore IFP{P) 
iterations on 0^ are also bounded by a constant independent of 0^. 

We have also seen above that each element in |0^| has a distinct L^(P) type. 
Therefore the number of L^(P) types realized over structures in C is unbounded. 
Hence we have proved the following. 

Theorem 2 There is a quantifier P and a class C of rigid structures, such that 
$L^{2}(P)$-types realized over C are unbounded but IFP(P) over C is bounded. 

Combining [2, Corollary 4.5, pp. 178] with Theorem 5, we immediately get 
the following corollary which shows that McColm’s second conjecture can not 
be extended in the presence of generalized quantifiers. 

Corollary 1 There is a quantifier P and a class C of rigid structures, such that 
IFP(P) is bounded on C but $L^{\omega}_{\infty\omega}(P) \neq FO(P)$ over C. 

We can modify the construction presented above to make the quantifier in 
Theorem 2, PTIME computable. The idea is to pad each 0^ in that construction 
with a large clique. So each 0^ will now be replaced by Sji = Tli^i 0 91^, where 
Vli is a large enough clique. As a result we do not get a class of rigid structures 
this time. The construction goes through with some easy changes. Due to lack 
of space we leave details of this to the full version and only note the statements 
of the results below. 




344 Anil Seth 



Theorem 3 There is a PTIME computable quantifier Q and a class D of finite 
structures, such that $L^{2}(Q)$-types realized over D are unbounded but IFP(Q) 
over D is bounded. It follows that $L^{\omega}_{\infty\omega}(Q) \neq FO(Q)$ over class D. 



Corollary 2 [3] There is a FTIMF computable quantifier Q such that is 
not IFF{Q) definable over the class of finite structures. 

4 Ramified Conjectures with Generalized Quantifiers 

By the standard techniques for obtaining normal forms for the fixed point logics 
(see ch. 7, [4]), it can be shown that $IFP(Q) = \bigcup_k IFP^k(Q)$. The following 
generalization of Theorem 2.3 of [9] can be proved easily. 

Lemma 6 [9] Let k be a positive integer and let Q be a set of generalized quantifiers. 
Let $(\varphi_1, \ldots, \varphi_l)$ be a system of $L^k(Q)$ formulae. Then the following hold 
for $1 \le i \le l$. 

(i) Each component $\Phi^m_i$ of every stage $\Phi^m = (\Phi^m_1, \ldots, \Phi^m_l)$, $m \ge 0$, of the 
inflationary operator $\Phi$ is defined by an $L^k(Q)$ formula. 

(ii) Each component $\varphi^\infty_i$ of the inflationary fixed point $(\varphi^\infty_1, \ldots, \varphi^\infty_l)$ of the 
system is $L^k_{\infty\omega}(Q)$-definable. As a result, we have that $IFP^k(Q) \subseteq L^k_{\infty\omega}(Q)$. 



4.1 Relating Number of k-Types and Uniform Boundedness of 
$LFP^k$ and $IFP^k$ 

The following lemma shows that, for any given structure $\mathfrak{A}$, there is a k-variable $IFP^k(Q)$ 
formula which can pass through as many distinct stages, during the computation 
of a fixed point, as the number of $L^k(Q)$-types in $\mathfrak{A}$. 

Lemma 7 Let a be a vocabulary, k a positive integer and Q a finite set of 
generalized quantifiers. Let % be a structure over a realizing m, L^(Q) types. 
Then there is an IFF^d(^Q'^ system S such that ch{S)(fii) = m. 

Proof: Easy. See full version. □ 

Corollary 3 Let 01 be a structure realizing m L^ types. Then there is an LFF^^ 
system S such that ch{S)(0i) = m. 

Proof: Just observe that Y occurs only positively in the formula 0^ constructed 
in the proof of Lemma 7. □ 

Using Lemma 7 and Theorem 3, we get the following theorem. 

Theorem 4 There is a class D of structures and a PTIME computable quantifier Q 
such that for each $k \ge 2$, $IFP^k(Q)$ over D is bounded but not uniformly 
bounded. 

4.2 Second Conjecture 

Theorem 5 Let k be a positive integer, C a class of finite structures and Q a 
finite set of generalized quantifiers. Then the following are equivalent. 

1. $L^k(Q) = L^k_{\infty\omega}(Q)$ over C. 

2. $IFP^k(Q)$ is uniformly bounded on C. 

3. IFF^'^{Q) is uniformly hounded on C. 

Proof: See full version. □ 

Using Corollary 3, we also get a slightly stronger version of a result in [9] as 
follows. 

Corollary 4 Let k be a positive integer and C a class of finite structures. Then 
the following are equivalent. 

1. $L^k = L^k_{\infty\omega}$ over C. 

2. $LFP^k$ is uniformly bounded on C. 

3. LFF^’^ is uniformly hounded on C. 



4.3 First Conjecture 

Because the two notions of boundedness are different we have two versions of 
this conjecture. We disprove the version with uniform boundedness. 

Theorem 6 There is a PTIME computable quantifier Q and a class D of structures, 
such that $IFP^{2}(Q)$ is not uniformly bounded on D but for each $k \ge 2$, 
$IFP^k(Q) = L^k(Q)$ over D. 

Proof: Straightforward using boundedness of IFP{Q) and Lemma 6. □ 

The other version of the first conjecture remains an open question. 

Open: Let C be a class of finite structures and Q be a finite set of generalized 
quantifiers. Is $IFP^k(Q) = L^k(Q)$ on C iff $IFP^k(Q)$ is bounded on C? 
Abstract. We show that all minimal a-b separators (vertex sets) disconnecting 
a pair of given non-adjacent vertices a and b in an undirected 
and connected graph with n vertices can be computed in $O(n^2 R_{ab})$ time, 
where $R_{ab}$ is the number of minimal a-b separators. This result matches 
the known worst-case time complexity of its counterpart problem of computing 
all a-b cutsets (edge sets) [13] and solves an open problem posed 
in [11]. 



Keywords: Algorithms, complexity analysis, cutsets, data structures, separators. 

1 Preliminaries 

A separator S in a connected graph G is a subset of vertices whose removal 
separates G into at least two connected components. An a-b separator in G is a 
separator whose removal disconnects vertices a and b in G. That is, after the 
removal of S, a and b reside in two different connected components of G [5]. An 
a-b separator is minimal if it does not contain any other a-b separator. When a 
and b are replaced by two vertex sets A and B in V respectively, an a-b separator 
becomes the more general A-B separator that disconnects A and B. 
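
The definitions above can be checked directly on small graphs. The following is a minimal brute-force sketch in Python, not the algorithm of this paper: it enumerates vertex subsets and tests them against the definition. The adjacency-dictionary representation and the function names are assumptions made for illustration only, and the running time is exponential in n.

```python
from itertools import combinations


def component(adj, start, removed):
    """Vertices reachable from `start` once the vertices in `removed` are deleted."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in removed and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def is_ab_separator(adj, a, b, S):
    """True if S avoids a and b and its removal disconnects a from b."""
    return a not in S and b not in S and b not in component(adj, a, S)


def minimal_ab_separators(adj, a, b):
    """Enumerate minimal a-b separators straight from the definition."""
    others = [v for v in adj if v not in (a, b)]
    result = []
    for r in range(1, len(others) + 1):
        for combo in combinations(others, r):
            S = set(combo)
            # Any superset of an a-b separator (avoiding a and b) is again an
            # a-b separator, so it suffices to test the subsets S - {v}.
            if is_ab_separator(adj, a, b, S) and not any(
                is_ab_separator(adj, a, b, S - {v}) for v in S
            ):
                result.append(frozenset(S))
    return result
```

For example, on the graph with edges a-c, a-d, c-e, d-e, e-b this sketch reports the two minimal a-b separators {c, d} and {e}. Such a reference enumerator is only useful for validating faster methods on tiny inputs.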

Computing separators under various constraints [2,3,7] is closely related to 
determining the (vertex) connectivity of a graph, which is a fundamental 
graph-theoretical problem with many important applications. The problem of 
computing (enumerating) all minimal a-b separators in a graph is not only of theoretical 
interest but also of significance in applications such as scheduling problems and 
network reliability analysis [1,4,7]. Due to its importance, this problem has been 
studied by many researchers within various contexts [2,4,7,8,11]. The current 
best known result is due to Shen et al. [11], whose algorithm computes all $R_{ab}$ minimal a-b 
separators in an n-vertex graph in $O(n^3 R_{ab})$ time¹. It was posed as an open 
problem whether there is an algorithm that finds all minimal a-b separators in 
quadratic time in n per separator, that is, whether the complexity of finding 
a-b separators can match that of finding a-b cutsets, where an a-b cutset is an 
edge set whose removal disconnects a and b [3,11]. 

In this paper, we give a firm answer to the above open problem by presenting 
an efficient data structure, namely the n-way bitwise ordered tree, to maintain 
all distinct separators generated by level-by-level adjacent-vertex replacement. 

First we present some notation and the level-by-level adjacent-vertex 
replacement approach proposed in [11], which we will also use later in this paper. 

Let $G = (V, E)$ be an undirected connected simple graph. For any $X, Y \subseteq V$, 
let $X - Y = X \setminus Y = \{u \in X \mid u \notin Y\}$. For notational simplicity, sometimes 
we also use $X(Y)$ to denote $X - Y$. We denote the subgraph induced by the 
vertices of X by $G[X] = (X, E(X))$, where $E(X) = \{(u, v) \in E \mid u, v \in X\}$, and 
a neighbourhood set of X by $N(X) = \{w \in V - X \mid \exists v \in X, (v, w) \in E\}$. 

Given an a-b separator S defined previously, we denote the connected components 
containing a and b in $G[V - S]$ by $C_a$ and $C_b$ respectively. For any $X \subseteq V$, 
we define the isolated set of X, denoted by $I(X)$, to be the set of vertices in X 
that have no adjacent vertices in $C_b$ of $G[V - X]$ and hence are not connected 
to $C_b$. 

Let $d(x)$ be the length of the shortest path (distance) from vertex x to 
a. For any $x \in V$, we define 

$N^+(x) = \{v \mid (x, v) \in E,\ v \in V \text{ and } d(v) = d(x) + 1\}$. (1) 

That is, $N^+(x)$ is the set of x's neighbours whose distance to a is greater than 
that of x by exactly 1. 
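
As an illustration of definition (1), the distances d(x) and the sets N^+(x) can be obtained with a breadth-first search from a. The sketch below assumes an unweighted graph stored as an adjacency dictionary; the names `bfs_levels` and `n_plus` are hypothetical and not taken from the paper.

```python
from collections import deque


def bfs_levels(adj, a):
    """Return d[x], the number of edges on a shortest path from a to x."""
    d = {a: 0}
    queue = deque([a])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in d:
                d[w] = d[u] + 1
                queue.append(w)
    return d


def n_plus(adj, d, x):
    """N^+(x): neighbours of x that lie exactly one level further from a."""
    return {v for v in adj[x] if d[v] == d[x] + 1}
```

On the graph with edges a-c, a-d, c-e, d-e, e-b and source a, the levels are d(c) = d(d) = 1, d(e) = 2, d(b) = 3, so N^+(a) = {c, d} and N^+(c) = {e}.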

Let vertices be assigned level numbers according to their distances from a. 
That is, a vertex is assigned level number i if it is i edges distant from a, indicating 
that it is ranked at level i from a. If all vertices in separator S are at level i, 
this separator is said to be at level i and is denoted by $S^{(i)}$. It has been shown in [11] 
that all (minimal) a-b separators can be generated by the so-called level-by-level 
adjacent-vertex replacement as follows: 

Theorem 1. Let $L_i$ be the collection of all a-b separators at level i, where 
$1 \le i \le h$ and $h \le n - 3$ is the maximal distance from a to any other vertex than b 
in G, and $L_0 = \{N(a) - I(N(a))\}$. The union of all $L_i$'s, $\bigcup_{0 \le i \le h} L_i$, contains all 
minimal a-b separators, if each separator $S^{(i)}$ in $L_i$ is obtained by adjacent-vertex 
replacement on some $x \in S^{(i-1)}$ in $L_{i-1}$ according to the following equation: 

$S^{(i)} = (S^{(i-1)} \cup N^+(x)) - I(S^{(i-1)} \cup N^+(x))$. (2) 

¹ A slightly more detailed analysis shows that their algorithm runs in $O(nmR_{ab})$ time 
for an m-edge G. 
Clearly from each $S^{(i-1)}$ we can generate at most $|S^{(i-1)}|$ new minimal a-b 
separators in the next level $L_i$. We say that separator $S^{(i-1)}$ precedes separator $S^{(i)}$, 
denoted by $S^{(i-1)} \prec S^{(i)}$, if $S^{(i)}$ is generated from $S^{(i-1)}$ by the above vertex 
replacement scheme. 
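
A single application of Equation (2) can be sketched as follows. This is an illustrative Python fragment that evaluates the isolated set I(X) directly from its definition by a graph search, rather than from the precomputed data used later in the paper. The helper names, the adjacency-dictionary input and the precomputed distance table `d` are assumptions of this sketch.

```python
def component(adj, start, removed):
    """Vertices reachable from `start` once the vertices in `removed` are deleted."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in removed and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def isolated(adj, b, X):
    """I(X): vertices of X with no neighbour in the component C_b of G[V - X]."""
    cb = component(adj, b, X)
    return {x for x in X if not any(w in cb for w in adj[x])}


def replace_step(adj, d, b, S, x):
    """One adjacent-vertex replacement on x in S, following Equation (2)."""
    if b in adj[x]:
        raise ValueError("x must not be adjacent to b")
    n_plus_x = {v for v in adj[x] if d[v] == d[x] + 1}
    X = set(S) | n_plus_x
    return X - isolated(adj, b, X)
```

On the graph with edges a-c, a-d, c-e, d-e, e-b, the initial separator is N(a) - I(N(a)) = {c, d}, and a replacement on c yields (S ∪ {e}) - {c, d} = {e}, which is the other minimal a-b separator of that graph.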

Let $L_{-1} = \{a\}$. We can now construct a unique expansion tree by taking $L_{-1}$ 
as the root and the elements (nodes) of $L_i$ as the nodes at level i, and connecting a 
node $S^{(i-1)}$ at level i − 1 to any node $S^{(i)}$ in level i if $S^{(i-1)} \prec S^{(i)}$, $0 \le i \le n - 3$. 

Apparently this expansion tree contains all minimal a-b separators [11]. 

If we construct the above expansion tree simply level by level, we may end 
up with a huge tree with many duplicate nodes, because every node at level 
i − 1 produces new nodes at level i independently. We therefore need to apply 
appropriate techniques to ensure that only distinct nodes are generated at each 
level of the expansion, so as to obtain a minimal-size tree. We denote such 
an expansion tree by T. 

Clearly T is constructed dynamically as level-by-level expansion proceeds. 
So a critical question is how we can incrementally maintain T for each insertion 
of a new node. In [11] the AVL tree is employed for this purpose, which requires 
$O(n \log |T|)$ time. In this paper we propose a more efficient data structure to 
maintain the expansion tree that uses the binary representation of separators to 
guide fast n-way searching for duplicates. Our new data structure can insert a 
new separator with ordered nodes (with respect to indices) in $O(n)$ time. 

In the next section we describe this new data structure and analyze its time 
and space complexity. 

2 Maintaining the Expansion Tree 

We use an n-way bitwise ordered tree to represent the (minimal) expansion tree 
T as follows: 

Originally T contains only one node (root) $L_0 = B^{(0)} = N(a) - I(N(a))$ at 
level 0. When level-by-level expansion proceeds, T grows dynamically up to 
$h \le n - 3$ levels. Each node $B^{(i)}_{t_i}$ at level i is an array of $n - i$ bits representing 
the last (greatest) $n - i$ vertices in lexicographical order in V: $B^{(i)}_{t_i}[0..n - i - 1]$, 
$1 \le t \le |L_i|$. Clearly $|L_i| \le \binom{n-2}{i}$ (the total number of ways of choosing i elements 
from $V - \{a, b\}$). All edges in T are implicitly represented: an edge from $B^{(i-1)}[t_i]$ 
to $B^{(i)}_{t_i}[j]$ is represented by $B^{(i)}_{t_i}[j] = 1$, which indicates that $x_j$ appears as the 
ith element with $x_{t_i}$ being its precedent in at least one separator; $B^{(i)}_{t_i}[j] = 0$ 
otherwise. A separator of $s \le n - 2$ ordered elements, $S = \{x_0, x_1, \ldots, x_{s-1}\}$, is 
uniquely represented by a path from level 0 to level $s - 1$: 
$B^{(0)}[x_0] \to B^{(1)}_{x_0}[x_1] \to \cdots \to B^{(s-1)}_{x_{s-2}}[x_{s-1}]$, where $B^{(0)}[x_0] = 1$ and $B^{(i)}_{x_{i-1}}[x_i] = 1$ for $1 \le i \le s - 1$. 
Since S is lexicographically ordered, clearly $x_{i-1} < x_i < x_{i+1}$ for all i. Therefore 
$n - i$ bits of a node are sufficient to represent $x_i$ at level i in T. The algorithm 
for maintaining T works as follows: for each newly generated separator 
$S = \{x_0, x_1, \ldots, x_{s-1}\}$, search T level by level from level 0; if there is an i such that 
$B^{(i)}_{x_{i-1}}[x_i] = 0$, the separator is new and the algorithm will insert it at the ith level 
by assigning $B^{(i)}_{x_{i-1}}[x_i] = 1$; otherwise the separator is a duplicate and therefore 
should be discarded. 

Figure 1 depicts the structure of our k-way bitwise ordered tree. 
Fig. 1. Structure of the k-way bitwise ordered tree T 



Note that $B^{(i)}_{t_i}$ in fact requires only $n - t_i$ bits instead of $n - i$, because any 
element falling into $B^{(i)}_{t_i}$ definitely has a greater rank than $t_i$ of its predecessor in 
the previous level, due to the property that elements in the separator are ordered. 
This shows that we need at most half of the space allocated to each level, for 
instance, $n^2/2$ rather than $n^2$ for $L_1$. However, we use here the maximal fixed 
length for bit arrays in each level for notational and analytical simplicity. We 
will use the precise figure later in the analysis of the maximal space requirement of 
T. 

We construct the expansion tree through dynamically maintaining it for node 
(separator) insertions. Our algorithm for maintaining T with respect to insertion 
of a new node S is described as follows. 

Procedure ExpansionTree(T, S) 

{*Insert ordered minimal a-b separator S = {x_0, x_1, ..., x_{s-1}} to expansion 
tree T. T has h ≤ n − 2 levels: L_0, L_1, ..., L_{h-1}, where L_0 = {N(a) − 
I(N(a))} and non-zero binary array B^(i)_{t_i}[0..n − i − 1] is a node in level L_i 
connected to the t_i-th bit (“1”) of some node B^(i-1)[t_i] in L_{i-1}, 0 ≤ t_i < n − i, 
1 ≤ i < h, 0 < s ≤ n − 2.*} 

i := 0; t_0 := 0; 

while (i ≤ s − 1) AND (B^(i)_{t_i} ∈ L_i) AND (B^(i)_{t_i}[x_i] = 1) do 
    t_{i+1} := x_i; i := i + 1; 

if i ≤ s − 1 then 
    {*S is new and should be inserted to L_i of T.*} 
    for j = i to s − 1 do 
        B^(j)_{t_j}[x_j] := 1; t_{j+1} := x_j 
    {*Build a corresponding path B^(0)[x_0] → B^(1)_{x_0}[x_1] → ... → B^(s-1)_{x_{s-2}}[x_{s-1}] to 
    represent S, where all the edges are implicit. If B^(j)_{t_j} ∉ L_j, get a new cell 
    (n bits) and insert it to L_j.*} 
else discard S 
    {*S is a duplicate.*} 
end. 
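
The insertion procedure can be sketched in Python as a trie whose nodes are bit arrays. This is an illustrative reading of the data structure, not the authors' implementation. It assumes vertices are numbered 0..n-1, uses a fixed n bits per node (the simplification adopted in the text), stores child links in an explicit dictionary instead of representing edges implicitly, and sorts with Python's built-in sort rather than bucket sort. The class and method names are hypothetical.

```python
class BitNode:
    """One node of the tree: a bit array plus links to next-level nodes."""

    def __init__(self, n):
        self.bits = bytearray(n)  # bits[v] == 1 iff v occurs at this position
        self.children = {}        # v -> BitNode holding the following position


class ExpansionTree:
    def __init__(self, n):
        self.n = n
        self.root = BitNode(n)

    def insert(self, separator):
        """Insert a separator; return True if it is new, False if a duplicate.

        As in the text, a separator counts as new exactly when some bit on
        its path was still 0. This relies on minimal separators never being
        subsets of one another.
        """
        xs = sorted(separator)
        node, is_new = self.root, False
        for i, x in enumerate(xs):
            if not node.bits[x]:
                node.bits[x] = 1
                is_new = True
            if i + 1 < len(xs):
                child = node.children.get(x)
                if child is None:
                    child = node.children[x] = BitNode(self.n)
                node = child
        return is_new
```

A short usage example: after `t = ExpansionTree(8)`, the call `t.insert([1, 3, 4])` reports a new separator, a second insertion of the same set in any order reports a duplicate, and `t.insert([1, 3, 5])` is new again because only the last bit on its path is unset. Each insertion touches one bit per element, which matches the O(s) = O(n) cost stated for already ordered input.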

It is not difficult to see that Algorithm ExpansionTree correctly maintains 
the expansion tree T that contains all distinct minimal a-b separators inserted 
to it (proof in the long version of the paper [12]). We now analyze the time and 
space complexities of the algorithm. The time complexity of the algorithm is 
easily seen to be $O(s) = O(n)$. The space complexity analysis is a little more complex. 
Due to the property of minimal separators, we know that if S and S' are two 
distinct minimal separators, then $S \not\subset S'$ and $S' \not\subset S$. Assume that a single word 
requires $\log n$ bits to store. We call the storage of $|S| \log n$ bits for separator 
S the minimum ordinary storage, as this is needed in ordinary word-by-word 
information storage. Clearly for $|S| = i$ there are at most $\binom{n-2}{i}$ separators corresponding 
to at most $\binom{n-2}{i}$ nodes at level $L_i$ in T. This indicates that the number of nodes 
in $L_i$ of T is at most $\binom{n-2}{i}$. We use i-separator to denote a minimal a-b separator 
of size i. As we are mainly interested in orders of magnitude, for simplicity 
of notation we sometimes use n and n − 2 interchangeably in our analysis. 

Lemma 1. Let M be the minimum ordinary storage for storing all minimal a-b 
separators. The space requirement of T is at most in the worst case 

and in the best case. 

Proof. See the long version of the paper [12]. 

Another important measure we need to work out is the maximal space requirement 
of T. Here we use the precise space allocation to nodes in T, that 
is, $B^{(i)}_{t_i}$ requires only $n - t_i$ bits instead of $n - i$. We shall prove the following 
lemma: 

Lemma 2. The maximal space requirement of T is $O(2^n)$ bits for a maximal 
number of $O(2^n/\sqrt{n})$ minimal a-b separators, each containing $n/2$ elements. 
This space is effectively $\sqrt{n}$ bits per separator on average and therefore is 
significantly smaller than the ordinary storage, which requires $n \log n / 2$ bits per 
separator. 

Proof. See the long version of the paper [12]. 

Lemma 2 shows an interesting property of our data structure T: it saves 
an $O(\sqrt{n} \log n)$ factor of the maximum space required by the ordinary storage for 
separators. 

Now let us look at how likely the worst and best cases can occur respectively. 

Lemma 3. If the number of all minimal a-b separators is s, the worst case space 
complexity of T occurs at probability no more than $n^{-(s-1)}$, whereas the best case space 
complexity occurs at probability greater than $n^{-(s+n)/2}$. 

Proof. First we have all elements in each separator ordered lexicographically. The 
probability of each case can be worked out by looking at the ways of distributing 
possible elements at the ith position of each separator among all bits node by 
node at level Li and finding out the ratio of the number of ways forming the 
required distribution (worst and best) versus the number of all possible ways such 
that each node gets at least one element. It is easy to see that when the level- 
by-level distribution proceeds the probability in either case decreases because 
the above ratio for each node is always smaller than 1 and the probability is the 
product of all the ratios on the nodes to which elements are already distributed. 
Therefore the ratio of any two of them will remain about the same at Lq and all 
levels in T. Thus working out this ratio on the root at level Lq is sufficient for 
simplicity. 

The number of all possible ways of distributing all first elements in s separators 
to n bits of $L_0$ is the number of ways of distributing values 0 to n − 1 to 
s elements with repetitions allowed: $U = n^s$. 

We use $\#C(p_1 + p_2 + \cdots + p_i = s)$ to denote the number of i-part compositions 
of integer s. Represent integer s by a straight line of length s, with left endpoint 
0 and right endpoint s, which is divided into s segments of unit length at s − 1 
intermediate points. Selecting i − 1 points from the s − 1 intermediate points 
($1 \le i \le s$) results in i intervals on the line, which correspond to a composition 
of s containing i parts (elements) [10]. This simple analogy shows that 

$\#C(p_1 + p_2 + \cdots + p_i = s) = \binom{s-1}{i-1}$. 
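
The counting argument can be confirmed numerically for small values. The sketch below counts i-part compositions of s by exhaustive enumeration and compares the result with the binomial coefficient C(s-1, i-1); the function name is hypothetical and the enumeration is only practical for small s.

```python
from itertools import product
from math import comb


def count_compositions(s, parts):
    """Count ordered tuples of `parts` positive integers that sum to s."""
    return sum(
        1 for p in product(range(1, s + 1), repeat=parts) if sum(p) == s
    )
```

For instance, the 2-part compositions of 4 are (1, 3), (2, 2) and (3, 1), in agreement with C(3, 1) = 3.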

In the worst case all s first elements must be the same, which can be any 
value between 0 and n − 1. Therefore the number of ways of distributing them 
to one bit is 

$E^+ = \binom{n}{1}\,\#C(p_1 = s) = n \binom{s-1}{0} = n$. 

In the best case all s first elements must cover the whole range of values from 
0 to n − 1, which is only possible when $s \ge n$. In this case, the number of ways 
of distributing them to all n bits is 

$E^- = \binom{n}{n}\,\#C(p_1 + p_2 + \cdots + p_n = s) = \binom{s-1}{n-1}$. 
Since $s \le \binom{n-2}{(n-2)/2} \approx 2^{n-2}/\sqrt{\pi(n-2)/2}$ by Stirling’s formula, the values $s \ge n$ 
make up nearly the whole range of s. So we can simply 
regard $s \ge n$ as covering the whole range. We approximate $E^-$ as follows: 

$E^- = \frac{(s-1)(s-2)\cdots n}{(s-n)(s-n-1)\cdots 1} > \left(\frac{s-1}{s-n}\, n\right)^{\frac{s-n}{2}} > n^{\frac{s-n}{2}}$. 

Thus the probabilities of occurrence of the worst and best cases are, respectively: 

$\Pr[E^+] = E^+/U = n^{-(s-1)}$, (3) 

$\Pr[E^-] = E^-/U > n^{-(s+n)/2}$, $s \ge n$. (4) 

From the above lemma we know that $\Pr[E^-] > \Pr[E^+]$ when $s > n$. Now 
let us see what their relationship is when $s < n$ by relaxing the best case 
definition so that each node in T contains s “1” bits when $s < n$. In this case, 
$E^- = \binom{n}{s}\,\#C(p_1 + p_2 + \cdots + p_s = s) = \binom{n}{s}$. Taking the average value over 
$0 < s < n$, we have 

$E^- = \frac{1}{n-1} \sum_{s=1}^{n-1} \binom{n}{s} = \frac{2^n - 2}{n-1}$, 

and 

$U = \frac{1}{n-1} \sum_{s=1}^{n-1} n^s = \frac{n^n - n}{(n-1)^2}$. 

Hence 

$\Pr[E^-] \approx (2/n)^n$, $s < n$. (5) 

This shows that $\Pr[E^-] > \Pr[E^+]$ also holds for $s < n$. Therefore we can 
conclude that the worst case always occurs at a probability much smaller than 
the best case. 

We now find out the probability at which our T requires less space than the 
minimum ordinary storage M. Clearly T requires less space than M if every 
node in T on average carries more than $n/\log n$ “1” bits representing $n/\log n$ 
elements of some separators, because the minimum ordinary storage for $n/\log n$ 
elements is n bits, which is the maximum size of a node in T. As in the analysis of the 
worst and best cases, for simplicity we only consider the first level $L_0$. We first 
compute the combined probability $\Pr[E]$ of $L_0$ containing i “1” bits for i = 1 
or 2, ..., or $n/\log n$. Since $L_0$ (and each other node) must contain at least one 
“1” bit, the probability of $L_0$ containing more than $n/\log n$ “1” bits is simply 
$\Pr[\bar{E}] = 1 - \Pr[E]$. Let $s > n/\log n$, which is a necessary condition for T 
to require less space than M, since otherwise $L_0$ can contain at most $s \le n/\log n$ 
“1” bits. For notational simplicity, we assume that $s \ge n$. The same 
analysis also applies to the case of $s < n$. 

The total number of ways of distributing s first elements to i bits in Lq for 
i = 1, 2, . . . , n/ logn is 




The probability for the above event to occur thus is 

Pr[E] = E/U < (4/n)^/(V^logn) (6) 

Consequently, the probability that Lq contains more than n/logn “1” bits, 
and hence T requires less space than M is 

Pr[E] = 1 - Pr[E] > 1 - (4/n)^/(V^logn). (7) 

Apparently $\Pr[\bar{E}]$ is close to 1, as $\Pr[E] \ll 1$, and is asymptotically 1 for large 
n and s. Hence we have 

Lemma 4. T requires less space than the ordinary storage N log n at an asymp- 
totic 1 probability at least 1 — {4/n)^ / {^/Enlogn), where N is the total number 
of elements in all s a-b separators drawn from [0, n — 1] and s >n. 

In the next section we show how to reduce the time complexity of other 
steps in level-by-level adjacent-vertex replacement by precomputing necessary 
data structures. Based on the n-way bitwise ordered tree, we then present an 
efficient algorithm that computes all minimal a-b separators at the cost of $O(n^2)$ 
per separator. 

3 Computing All a-b Separators 

Our algorithm for computing all minimal a-b separators computes separators 
level by level according to (2), starting from $S^{(0)} = N(a) - I(N(a))$, where all 
distinct separators are correctly kept (and duplicates are deleted) 
by efficiently maintaining a minimal-size expansion tree T into which newly 
generated separators are inserted by algorithm ExpansionTree. In order to avoid repeating 
the same computation and hence improve the efficiency of the algorithm, we 
precompute necessary common data which will be used frequently in different 
steps. Typically, we need to precompute and store $C_b$ and, for every $x \in V$, 
$N^+(x)$, $C_b(N^+(x))$ and $I(N^+(x))$. Clearly, the space requirement for storing 
these data is $O(n^2)$, which is much less than that for T. 

We now present our algorithm as follows. 



Procedure (a, b)-separators(G, a, b, T) 

{*Generate all distinct minimal a-b separators for given non-adjacent vertices 
a and b in G = (V, E), |V| = n. Input G, a and b. Output T = ⋃_i L_i, 
where L_i contains the nodes of the ith level in T.*} 

1 Compute the distance d(v) from a to every vertex v ∈ V and label v with 
  d(v); 

2 Compute the connected component C_b (containing b) of graph G[V − {a}]; 

3 for each x ∈ V do 
  3.1 Compute N^+(x) = {v | (x, v) ∈ E, d(v) = d(x) + 1}; 
  3.2 Compute C_b(N^+(x)) = C_b − N^+(x); 
  3.3 Compute I(N^+(x)) = N^+(x) − C_b; 

4 S := {a}; L_{-1} := {S}; k := −1; T := ∅; 

5 while (k < n − 3) ∧ (C_b ≠ ∅) do 
    for each S ∈ L_k do 
  5.1   I(S) := S − C_b; 
        for each x ∈ S that is not adjacent to b do 
  5.2       C_b(N^+(x) ∪ S) := C_b(N^+(x)) − S; 
            if C_b(N^+(x) ∪ S) ≠ ∅ then 
  5.3           I(S ∪ N^+(x)) := I(S) ∪ I(N^+(x)); 
  5.4           S' := (S ∪ N^+(x)) − I(S ∪ N^+(x)); 
                {*Generate a new separator S' for the next level L_{k+1}.*} 
  5.5           Sort S' in lexicographical order using bucket sort; 
  5.6           ExpansionTree(T, S'); 
                {*Insert S' to T and also add it to L_{k+1} if S' is new (distinct); 
                discard S' otherwise (duplicate).*} 
    k := k + 1 

end. 



The algorithm correctly computes all minimal a-b separators using level-by-level 
adjacent-vertex replacement as stated in Theorem 1, where each separator 
is generated by Equation (2). Moreover, all distinct separators generated are 
kept implicitly in T as paths from the root to leaves that connect the 
corresponding “1” bits across different levels, whereas all duplicates are discarded 
when inserting them to T by Step 5.6. 

Figure 2 shows the process of generating all minimal a-b separators of a graph 
by the above algorithm and their representation in the expansion tree T. 




(a) Graph G and its minimal a-b separators: {c, d, e}, {d, e, g}, {c, e, f, h}, {d, f, g}, {f, g, h} 
(b) Level-by-level adjacent-vertex replacement 
(c) Minimal expansion tree storing all distinct a-b separators of G 
Fig. 2. Computing all minimal a-b separators and their representation in T 



Let us now analyze the time complexity of the algorithm: Step 1 can be done 
in $O(n^2)$ time by using Dijkstra’s one-to-all shortest path 
algorithm to construct a shortest path tree rooted at a and then labeling all 
vertices level by level in the tree. Step 2, computing the connected component 
containing b in graph $G[V - \{a\}]$, can be done in $O(n^2)$ time by first computing the 
connected components of $G[V - \{a\}]$, which takes $O(|V| + |E|) = O(n^2)$ time, 
and then finding the one containing b in at most $O(n)$ time (there are at most 
n − 1 connected components of $G[V - N(a)]$). For each x, Steps 3.1, 3.2, and 
3.3 each require $O(n)$ time, therefore the whole of Step 3 can be done in $O(n^2)$ 
time. Step 4 requires $O(n)$ time. For Step 5, the outermost loop is executed at 
most n − 2 times ($|S| \le n - 2$), and the next two nested loops are executed 
$\sum_k |L_k| = R_{ab}$ times in total, where $R_{ab} \le \binom{n-2}{(n-2)/2}$ is the total number of distinct minimal a-b 
separators maintained in T, since T does not contain any duplicate separators. 
Because $|S|, |C_b| \le n$, Step 5.1 requires $O(n)$ time. Moreover, since the size of 
each of the participating sets in the computation in Steps 5.2-5.4 is at most 
n, Steps 5.2-5.4 can all be done in $O(n)$ time. Step 5.5 requires $O(n)$ time by the 
well-known bucket sort [9], and the same holds for Step 5.6 by Lemma 2. So the whole of 
Step 5 can be done in $O(n^2 R_{ab})$ time. As shown in the previous section, the maximal 
number of minimal a-b separators is bounded by 

$R_{ab} \le \binom{n-2}{(n-2)/2} \approx \frac{2^{n-2}}{\sqrt{\pi (n-2)/2}} < \frac{2^{n-2}}{\sqrt{n-2}}$. (8) 
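
The chain of bounds in (8) can be sanity-checked numerically. The following sketch, with a hypothetical function name, evaluates the central binomial coefficient for m = n - 2, the Stirling-type estimate $2^m/\sqrt{\pi m/2}$ and the looser bound $2^m/\sqrt{m}$ for even m. It is a numerical check over a finite range, not a proof.

```python
from math import comb, pi, sqrt


def central_binomial_bounds(m):
    """Return (C(m, m/2), Stirling estimate, looser bound) for even m >= 2."""
    exact = comb(m, m // 2)
    stirling = 2**m / sqrt(pi * m / 2)
    loose = 2**m / sqrt(m)
    return exact, stirling, loose
```

For m = 4 (that is, n = 6) the three values are 6, about 6.38 and 8, so the exact count sits just below the Stirling estimate, which in turn is below the looser bound used in Theorem 2.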

As a result we have the following theorem. 

Theorem 2. Given an n-vertex undirected graph, all minimal a-b separators 
for non-adjacent vertices a and b can be generated in $O(n^2 R_{ab})$ time, where 
$R_{ab} < 2^{n-2}/\sqrt{n-2}$ is the number of minimal a-b separators. 

Replacing the single vertex a with set A and b with B in (a, b)-separators, 
we can generalize the algorithm to the case that a and b are two non-adjacent 
vertex sets A and B in G. Note that $N^+(A)$ can be obtained in $O(|A|n)$ time, 
and $C_B$ in $O((n - |A|)^2)$ time by first computing $C_b$ in $G[V - A]$ and then 
finding out which one (in case $C_b$ consists of several connected components) 
contains all vertices of B. In Step 5, the dominating part of (a, b)-separators, 
the outermost loop now clearly needs to be executed only $n - |A| - |B|$ times, 
and the total number of iterations of the next two nested loops is equal to the 
number of all minimal A-B separators, $R_{AB}$. All steps inside the loops (5.1-5.4) 
need $O(n)$ time each. Therefore we have 

Corollary 1. Let A and B be two non-adjacent subsets of V in G(V, E). We
can generate all minimal A-B separators in $O(n(n - n_A - n_B)R_{AB})$ time, where
$n_A = |A|$, $n_B = |B|$, $n = |V|$ and $R_{AB}$ is the number of minimal A-B separators.

The above corollary shows an improvement by a factor of $O(n)$ over the previously
known result for computing A-B separators [11].

4 Concluding Remarks 

The proposed algorithm can be easily generalized to the case that a and b are
two non-adjacent vertex sets A and B in G. In this case, for $n_A = |A|$, $n_B = |B|$,
all minimal A-B separators can be computed in $O(n(n - n_A - n_B)R_{AB})$ time,
where $R_{AB}$ is the number of all minimal A-B separators.

Our algorithm can also be employed to compute all separators of a graph
in the same way as [11] to improve the $O(n^3 R_\Sigma)$ time which was needed for
computing $R_\Sigma$ minimal separators of the graph. However, this is not necessary
in most cases because all separators of G can be computed more efficiently by the
following simple algorithm: one by one, compute the $\binom{n}{i}$ i-element combinations for
i = 1, 2, ..., n - 2, check whether each separates V, and output those separating
V. Computing and outputting each combination chosen from n vertices requires $O(n)$
time. For each such combination S, checking whether it separates V can be done in
$O(n^2)$ time by computing the connected components of $G[V - S]$. So the total
time required is



$$T_s = O\Big(n^2 \sum_{i=1}^{n-2} \binom{n}{i}\Big) = O(n^2 2^n). \tag{9}$$

Hence we have



Theorem 3. All separators in an n-vertex undirected graph can be computed in
$O(n^2 2^n)$ time.
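
The simple enumeration behind Theorem 3 can be sketched as follows. This is a minimal illustration, assuming the graph is given as an adjacency dictionary and that a vertex set S "separates V" when G[V - S] has more than one connected component; the function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def is_separator(adj, removed):
    """Return True if deleting the vertex set `removed` leaves a disconnected graph."""
    remaining = [v for v in adj if v not in removed]
    if len(remaining) < 2:
        return False
    # Breadth-first search from one remaining vertex, never entering removed vertices.
    seen = {remaining[0]}
    queue = deque([remaining[0]])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in removed and w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) < len(remaining)


def all_separators(adj):
    """Enumerate every i-element vertex subset, i = 1..n-2, that disconnects the graph."""
    vertices = sorted(adj)
    n = len(vertices)
    found = []
    for i in range(1, n - 1):
        for combo in combinations(vertices, i):
            candidate = set(combo)
            if is_separator(adj, candidate):
                found.append(candidate)
    return found


if __name__ == "__main__":
    cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
    print(all_separators(cycle))
```

Each connectivity check touches every vertex and edge once, which matches the $O(n^2)$ per-combination cost used in the analysis above.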

By (8), since $R_{ab} \le 2^{n-2}/\sqrt{n-2}$, the above result shows that computing all a-b
minimal separators is at least on the order of $\sqrt{n-2}$ times cheaper than computing all

minimal separators.

Abstract. Ant Colony Optimization (ACO) is a paradigm that employs
a set of cooperating agents to solve functions or obtain good solutions
for combinatorial optimization problems. It has previously been
applied to the TSP and QAP with encouraging results that demonstrate 
its potential. In this paper, we present FF-AS-SBP, an algorithm that 
applies ACO to the ship berthing problem (SBP), a generalization of 
the dynamic storage allocation problem (DSA), which is NP-complete. 
FF-AS-SBP is compared against a randomized first-fit algorithm. Ex- 
perimental results suggest that ACO can be applied effectively to find 
good solutions for SBPs, with mean costs of solutions obtained in the
experiment on difficult (compact) cases ranging from 0% to 17% above the
optimum. By distributing the agents over multiple processors, applying local
search methods, optimizing numerical parameters and varying the basic 
algorithm, performance could be further improved. 



1 Ant Colony Optimization 

The Ant Colony Optimization (ACO) paradigm was introduced in [1], [2] and 
[3] by Dorigo, Maniezzo and Colorni. ACO has been applied effectively to the 
traveling salesman problem (TSP) [4] and the quadratic assignment problem 
(QAP) [5], among several other problems. The basic idea of ACO is inspired by 
the way ants explore their environment in search of a food source, wherein the 
basic action of each ant is: to deposit a trail of pheromone (a kind of chemical) 
on the ground as it moves, and to probabilistically prefer moving in directions 
with high concentrations of pheromone deposit. 

As an ant moves, the pheromone it leaves on the ground marks the path that 
it takes. Another ant that passes by later can detect the pheromone and decide 
to follow the trail with high probability. If it does follow the trail, it leaves its 
own pheromone on it, thus reinforcing the existing pheromone deposit. By this 
mechanism, the movement of ants along a path between the nest and the food 
reinforces the pheromone deposit on it, and this in turn encourages further traffic 
along the path. This behavior characterized by positive feedback is described as 
autocatalytic. 



P.S. Thiagarajan, R. Yap (Eds.): ASIAN’99, LNCS 1742, pp. 359-370, 1999. 
(c) Springer-Verlag Berlin Heidelberg 1999

On the other hand, ants may take a direction other than the one with the
highest pheromone concentration. In this way, an ant does not always have to 
travel on the path most traveled. If an ant takes a path less traveled that deviates 
slightly from a popular path, and also happens to be better (shorter) than other 
popular paths, the pheromone it deposits encourages other ants to also take this 
new path. Since this path is shorter, the rate of pheromone deposit per ant that 
travels on it is higher, as an ant traverses a shorter distance in one trip. In this 
way, positive feedback can occur on this path and it can start to attract ants 
from other paths. 

By the interplay of these two mechanisms, better and better paths emerge as 
the exploration proceeds. For the purpose of designing an algorithm based on 
this idea drawn from nature, an analogy can be made of: 1) real ants vs. artificial 
agents, 2) ants’ spatial environment vs. space of feasible solutions, 3) goodness of 
a given path vs. objective function of a given solution, 4) desirability of taking a 
particular direction vs. desirability of making a certain decision in constructing 
the solution, 5) real pheromone at different parts of the environment vs. artificial 
pheromone for different solution choices. One of the main ideas behind ACO
algorithms is how relatively simple agents can, without explicit communication, 
cooperate to solve a problem by indirect communication through distributed 
memory implemented as pheromone. 

In this paper, we study how ACO can be applied effectively to the ship
berthing problem (SBP), through the FF-AS-SBP algorithm, an application of
ACO to the SBP. The focus of this study is not on the SBP itself or on
fine-tuning our algorithm for maximum performance. Rather, it is on demonstrating
that ACO can be applied effectively to the SBP. In Section 2, we formally
describe the SBP. In Section 3, we describe a candidate solution representation,
from which we adapt an indirect, first-fit (FF), solution approach in Section 4
so that it becomes more suitable for the complex nature of the SBP. In this
section, we also describe a randomized FF algorithm and the basis of FF-AS-SBP.
FF-AS-SBP is described in Section 5. By naming the algorithm FF-AS-SBP,
we acknowledge that there could be many other ACO algorithms for the SBP.
In Section 6, we describe the experiment and report and interpret the results, 
comparing FF-AS-SBP against the randomized FF algorithm. In this section, we 
also discuss how results could be further improved, how the algorithm lends itself 
to parallelization, and possible future work. Finally, Section 7 is the conclusion. 



2 The Ship Berthing Problem 

This problem, which has been studied in [6] and [7], can be defined as follows:
ships $S = \{S_i : i = 1, 2, \ldots, n\}$ are specified to have lengths $l_i$, arrive at a port
at specified times $t_i$ and stay at the port for specified durations $d_i$. Each ship
that arrives is to be berthed along a wharf line of length L, i.e., it is placed
at the interval $(b_i, b_i + l_i)$ along the wharf line. Once berthed, its location is
fixed for the entire duration of its stay. Also, each ship has a minimum inter-ship
clearance distance $c_i^s$ and a minimum end-berth clearance distance $c_i^b$. Four
types of constraints apply:

— Ships can only be berthed within the extent of the wharf line. No part of
any ship can extend beyond the beginning or the end of the wharf line. More
strongly, the distance from either end of a ship to either end of the wharf
line cannot be less than the minimum end-berth clearance distance $c_i^b$.

$$\forall i \in \{1, 2, \ldots, n\}: \quad c_i^b \le b_i \le L - l_i - c_i^b$$

— No two ships can share the same space along the wharf line if the time
intervals in which they are berthed intersect. More strongly, the end-to-end
distance between them cannot be less than the minimum inter-ship clearance
($c_i^s$, $c_j^s$) of either one of them.

$$\forall i, j \in \{1, 2, \ldots, n\},\ i \ne j: \quad (t_i, t_i + d_i) \cap (t_j, t_j + d_j) \ne \emptyset \;\Rightarrow\; (b_i - \max\{c_i^s, c_j^s\},\ b_i + l_i + \max\{c_i^s, c_j^s\}) \cap (b_j, b_j + l_j) = \emptyset$$

— A ship may be given a fixed berthing location ($b_i$ is fixed for some values of i).

— A ship may be prohibited from berthing in certain intervals of the wharf
line. More precisely, the interval bounded by the location of the two ends of
a ship after it has been berthed cannot intersect with any of the prohibited
intervals.

$$(b_i, b_i + l_i) \cap (p, q) = \emptyset$$ if the constraint applies to $S_i$, where
(p, q) is some forbidden interval



The minimization version of the problem is to determine Lq the minimum 
length of the wharf line needed to berth all the ships subject to the given con- 
straints. 

The decision version of the problem is to determine whether berthing is pos- 
sible, given a fixed value of L. 

The density D is defined as the maximum total length of berthed ships at any
one time:

$$D = \max_{t \in (-\infty, +\infty)} \sum_{i \,:\, t \in (t_i,\, t_i + d_i)} l_i$$

It is easy to see that D is a tight lower bound on L.

In this paper, we also define a measure F, which we call the fragmentation,
defined as:

$$F = 1 - \frac{\sum_i d_i l_i}{\left(\max_i (t_i + d_i) - \min_i (t_i)\right) D}$$

The berthing scenario can be visualized as a 2-D plane where the x-axis
represents time, the y-axis represents space (along the wharf line), and each ship
corresponds to a rectangular block whose x-extent is the time extent and whose
y-extent is the space extent of the ship. Figure 1 provides an example of this
setup. Given an optimum solution of cost D, F is the percentage of the total area
that is not covered by a block, within the effective duration of the problem.

^ For convenience, this definition ignores minimum end-berth and inter-ship clearance

Fig. 1. A Sample Berthing Scenario
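
The density and fragmentation measures can be computed with a simple sweep over arrival and departure events. The sketch below assumes each ship is a `(length, arrival, duration)` tuple, treats a stay as the half-open interval from arrival to arrival + duration, and ignores clearance distances as the footnote does; the tuple layout and function names are illustrative choices, not the paper's.

```python
def density(ships):
    """Maximum total length of ships berthed at any one time."""
    events = []
    for length, arrival, duration in ships:
        events.append((arrival, length))              # ship starts occupying space
        events.append((arrival + duration, -length))  # ship frees its space
    # At equal times, departures (negative deltas) are processed before arrivals.
    events.sort(key=lambda e: (e[0], e[1]))
    best = current = 0
    for _, delta in events:
        current += delta
        best = max(best, current)
    return best


def fragmentation(ships):
    """Fraction of the (time span x density) rectangle not covered by any ship block."""
    area = sum(length * duration for length, _, duration in ships)
    start = min(arrival for _, arrival, _ in ships)
    end = max(arrival + duration for _, arrival, duration in ships)
    return 1.0 - area / ((end - start) * density(ships))


if __name__ == "__main__":
    ships = [(10, 0, 5), (20, 3, 4), (5, 5, 2)]
    print(density(ships), fragmentation(ships))
```

A compact instance, in which the blocks tile the whole rectangle, gives a fragmentation of zero under this definition.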

By setting the inter-ship and end-berth clearance distances $c_i^s$ and $c_i^b$ to zero
and removing the last two constraints, the SBP
becomes the dynamic storage allocation problem (DSA), which is known to be
NP-complete and for which there is a 5-approximation algorithm (with a cost
upper bound of five times the optimum) [8]. Since the SBP is a generalization of DSA,
it is also NP-complete. [7] provides a reduction of the NP-complete partition
problem to the SBP, which is another way to show that the SBP is NP-complete.

While simple to understand, a direct, geometric, representation for the SBP 
does not lend itself to an effective solution strategy. For this reason, a graph 
representation for the SBP was proposed in [7], together with an algorithm that 
uses this representation for finding good solutions. 

In the new representation, each vertex $v_i$ corresponds to the ship $S_i$ and has
weight $l_i$. Two distinct vertices $v_i$ and $v_j$ are connected by an (undirected) edge of
weight $\max\{c_i^s, c_j^s\}$ (the larger of the two inter-ship clearances) iff
$(t_i, t_i + d_i)$ and $(t_j, t_j + d_j)$ intersect. There are also zero-weight
vertices $v_l$ and $v_r$, and arcs $(v_l, v_i)$ and $(v_i, v_r)$ of weight $c_i^b$ (the end-berth
clearance) for each vertex $v_i$ that corresponds to a ship $S_i$. The constraints related
to prohibited and fixed berthing positions can be represented using a combination of
auxiliary vertices {...} acting as imaginary ships, auxiliary edges and auxiliary arcs.
Details of this representation can be found in the original paper [7].

A feasible solution is any DAG, G, resulting from setting the direction of all 
edges, i. e., converting each edge to an arc of either direction. The cost of the 
solution, cost(G), is equal to both the length of the longest path in it and the 
value of L corresponding to that solution. The objective, therefore, is to find a 
DAG with as short a longest path as possible. 

3 Solution as a Vertex Permutation 

Not all edge direction assignments lead to a feasible solution as some lead to
circuits, which do not exist in a DAG. Rather than searching all $2^{|E|}$ possible
digraphs, many of which may not be DAGs, we can map each possible DAG to a
vertex permutation and perform the search over the space of possible permutations.
Each permutation $\pi = \begin{pmatrix} 1 & 2 & \cdots & n \\ \pi_1 & \pi_2 & \cdots & \pi_n \end{pmatrix} = (\pi_1 \pi_2 \ldots \pi_n)$ maps each vertex i to
an integer label $\pi_i \in \{1, 2, \ldots, n\}$, where n = |V|. An edge (u, v) is set to arc (u, v) iff
$\pi_u < \pi_v$ and to (v, u) iff $\pi_v < \pi_u$. In this way, the digraph induced by a permutation
is always a DAG, and each DAG has at least one corresponding permutation. A
given permutation can always be normalized w.r.t. a given graph by first
computing the DAG it induces and then performing a deterministic DAG labeling
to obtain the normalized permutation.
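
To make the mapping from a permutation to a solution cost concrete, the sketch below orients each edge from the smaller to the larger label and then computes the longest path in the resulting DAG. It is a simplified illustration: the end vertices, end-berth clearances and auxiliary vertices of the full representation are left out, and the data layout (`lengths`, `edges`, `perm`) is an assumption made for this example.

```python
def dag_cost(lengths, edges, perm):
    """Longest path (vertex weights plus arc weights) in the DAG induced by perm.

    lengths[i] is the weight of vertex i, edges maps an undirected pair (u, v)
    to its weight, and perm[i] is the integer label of vertex i.
    """
    n = len(lengths)
    successors = [[] for _ in range(n)]
    for (u, v), weight in edges.items():
        # Orient the edge from the smaller label to the larger label.
        if perm[u] < perm[v]:
            successors[u].append((v, weight))
        else:
            successors[v].append((u, weight))
    start = [0] * n  # earliest feasible position of each vertex along the wharf
    cost = 0
    # Increasing label order is a topological order of the induced DAG.
    for u in sorted(range(n), key=lambda x: perm[x]):
        finish = start[u] + lengths[u]
        cost = max(cost, finish)
        for v, weight in successors[u]:
            start[v] = max(start[v], finish + weight)
    return cost


if __name__ == "__main__":
    lengths = [3, 4, 5]
    edges = {(0, 1): 0, (1, 2): 0}
    print(dag_cost(lengths, edges, [1, 2, 3]), dag_cost(lengths, edges, [2, 1, 3]))
```

In this toy instance a permutation and its reverse give the same cost, while changing the label of a single vertex changes the cost noticeably.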

In this paper, we use the terms ‘labeling’ and ‘position’ both to mean ‘permu- 
tation’. ‘Position’ carries the meaning that vertices can be ordered in a sequence 
s. t. the first vertex in the sequence has label 1, the second vertex in the sequence 
has label 2 and so on, and the position of a vertex, which is equivalent to its 
labeling, is its position in that sequence. 

The permutation representation of solutions seems to lend itself to a direct
solution strategy similar to that used in [5], where the solution is also represented
as a permutation. However, there is one important difference that makes the
permutation representation problematic for the SBP. In the QAP, individual
labels contribute piecewise additively to the final objective function. In the SBP,
individual labels alone do not determine the cost of the longest path. This value
is a function of the collective interaction between the vertex positions within the
graph and there is no known straightforward relationship between the cost and
the labeling over a subset of the vertices. Over all possible DAGs, the cost of
the longest path is not predicted or determined by individual vertex labels, or
even arc directions, as the following two examples illustrate: the mere flip of a
single arc can drastically change the cost; the reverse of a DAG G (all arcs are
flipped; $v_l$ is swapped with $v_r$) has the same cost as G.

Therefore, both the graph representation and the permutation representation
seem unlikely to support ACO algorithms. For this reason, we adapted from the
permutation representation a more indirect solution approach, which is explained
in the next section.

4 First-Fit Approaches

In the standard first-fit algorithm (FF), ships are berthed (packed) one at a time
in some predetermined order. When a ship $S_i$ is to be packed, it is positioned as
near to the front as possible without overlapping other ships $S_j$ which have been
berthed and whose time intervals $(t_j, t_j + d_j)$ intersect with the time interval of
$S_i$, $(t_i, t_i + d_i)$. Of course, the packing of each ship is also done so that none of
the other problem constraints are violated.

This can be visualized on the geometric representation as positioning each block
(ship), one at a time, with the x-coordinate fixed, minimizing the y-coordinate
while not letting the current block overlap (or come too close to) other blocks
which have been packed, or violate any other constraint.

When all the ships have been packed, the cost of the solution can be easily
determined from the y-coordinate, length and minimum end-berth clearance of
each ship.

The inputs to the FF algorithm, therefore, are the SBP problem itself and the
order in which to pack the ships, represented as a permutation $\pi$, where ship i is
the $\pi_i$th ship to be packed. It should be clear that for a given SBP problem, some
permutations yield better solutions than others. In fact, it can be shown that
there always exists a permutation that yields an optimum solution. However, for
a given problem, it is difficult to perform a quick analysis to obtain an optimum,
or even a good permutation. A straightforward approach, therefore, is to actually
try different permutations.
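
A minimal sketch of first-fit for the special case without clearance distances, fixed positions or forbidden intervals (that is, the DSA case) might look as follows. Ships are assumed to be `(length, arrival, duration)` tuples with half-open stays, and `order` lists ship indices in packing order, which can be obtained from a permutation $\pi$ by sorting the ships on $\pi_i$; these conventions and names are illustrative assumptions.

```python
def first_fit(ships, order):
    """Pack ships in the given order, each at the lowest free position.

    Returns (cost, positions), where cost is the wharf length used and
    positions maps a ship index to its berth position.
    """
    positions = {}
    for i in order:
        length, arrival, duration = ships[i]
        # Space intervals of packed ships whose stay overlaps the stay of ship i.
        blocked = sorted(
            (positions[j], positions[j] + ships[j][0])
            for j in positions
            if ships[j][1] < arrival + duration and arrival < ships[j][1] + ships[j][2]
        )
        berth = 0
        for low, high in blocked:
            if berth + length <= low:
                break  # the gap in front of this blocked interval is large enough
            berth = max(berth, high)
        positions[i] = berth
    cost = max((positions[i] + ships[i][0] for i in positions), default=0)
    return cost, positions


if __name__ == "__main__":
    ships = [(1, 0, 2), (1, 1, 2), (2, 2, 2)]
    print(first_fit(ships, [0, 1, 2])[0], first_fit(ships, [1, 0, 2])[0])
```

On this small instance the two packing orders give different costs, which illustrates why the choice of permutation matters.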



4.1 Randomized First-Fit Algorithm 

This algorithm simply generates a random permutation and calls FF with it. This
is repeated as many times as required to obtain better and better solutions.
Improvements become increasingly rare as the cost of the best discovered solution
approaches the optimum.
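
The randomized loop can be sketched independently of the packing routine. In the sketch below, `cost_of` is a hypothetical callable that is assumed to run FF on a packing order and return the resulting cost; the fixed number of tries and the seed parameter are choices made for this example, whereas the experiments in the paper use a time limit.

```python
import random


def randomized_search(n, cost_of, tries, seed=0):
    """Try `tries` random packing orders of n ships and keep the cheapest one."""
    rng = random.Random(seed)
    best_cost, best_order = None, None
    for _ in range(tries):
        order = list(range(n))
        rng.shuffle(order)          # a uniformly random permutation
        cost = cost_of(order)
        if best_cost is None or cost < best_cost:
            best_cost, best_order = cost, order
    return best_cost, best_order


if __name__ == "__main__":
    # Toy stand-in for FF: total displacement of the order from the identity.
    def toy_cost(order):
        return sum(abs(v - i) for i, v in enumerate(order))

    print(randomized_search(5, toy_cost, tries=100))
```

Because each try is independent of the previous ones, this search has no memory, which is the property the pheromone matrix of the next sections is meant to add.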



4.2 ACO-First-Fit Algorithm 

Using an ACO approach, we would like to obtain good permutations for doing
FF. 

Each permutation can be assigned a cost, which is the cost of the solution 
obtained by calling FF with it. 

Unlike in DAG labeling, permutations could be related to cost in a
straightforward way. In other words, for a given problem, there could be some easily
represented, hidden rules by which FF packing could be ordered so as to obtain
a reasonable solution. For example, the complex nature of the SBP seems less
pronounced in this representation: in general, a permutation
does not have the same cost as its reverse, and displacing one ship in the
packing sequence does not drastically change the cost, in contrast to DAG
labeling.

Intuitively, it is the relative order in which packing is done that affects the
cost, rather than the absolute position of each ship in the packing sequence: we
are concerned with whether "$S_i$ is packed before $S_j$" rather than whether "$S_i$
is the jth ship to be packed". To support this mode of representation, given a
permutation $\pi$ of length n, we define an n x n boolean, lower-triangular matrix
(only entries strictly below the diagonal are defined):

$$R = (r_{ij}), \qquad r_{ij} = \begin{cases} \text{true} & \text{if } \pi_i < \pi_j \\ \text{false} & \text{if } \pi_i > \pi_j \end{cases}$$

It can be easily seen that there is a one-to-one function that maps $\pi$ to R, and
that the reverse is not true, i.e., some values of R are invalid. A permutation $\pi$
can be constructed from constraints encoded in (a valid) R as follows:

1. Set $\pi_1 = 1$.

2. Determine (from the 2nd row of R) whether $\pi_2 < \pi_1$ (then $\pi_2 = 1$ and $\pi_1$
should be shifted right, i.e., set to 2) or $\pi_2 > \pi_1$ (then $\pi_2 = 2$).

3. Determine (from the 3rd row of R) whether $\pi_3$ = 1, 2 or 3 and set it to that
value. For all $i \ne 3$ such that $\pi_i \ge \pi_3$, increment $\pi_i$ by 1.

4. In the same fashion, determine $\pi_i$ and update $\pi$ for $i \in \{4, \ldots, n\}$.

The above definition provides the basis for the pheromone design of FF-AS-SBP.
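
The construction steps above can be sketched in a few lines. The sketch uses 0-based vertex indices and stores R as a list of rows, where row i holds the entries for j < i; both conventions, and the function names, are assumptions made for this illustration. It only handles valid matrices.

```python
def relation_matrix(pi):
    """Row i holds, for each j < i, whether pi[i] < pi[j]."""
    return [[pi[i] < pi[j] for j in range(i)] for i in range(len(pi))]


def permutation_from_relations(R):
    """Rebuild the permutation encoded by a valid lower-triangular matrix R."""
    pi = []
    for row in R:
        # The new vertex is placed one past the earlier vertices that precede it.
        p = 1 + sum(1 for comes_before in row if not comes_before)
        # Shift every existing label at or above p to make room.
        pi = [label + 1 if label >= p else label for label in pi]
        pi.append(p)
    return pi


if __name__ == "__main__":
    original = [3, 1, 4, 2]
    print(permutation_from_relations(relation_matrix(original)))
```

Only the relative order of each new vertex with respect to the earlier ones is consulted, which is the information the pheromone matrix of the next section is designed to carry.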



5 Ant Colony Optimization for the SBP 

A vital factor in the effectiveness of any ACO algorithm is the design of the
pheromone trail data structure: the information contained therein and how it
is used. The FF-AS-SBP algorithm presented below uses the design we have
experimentally found to give the best performance.

The algorithms in this section are not presented in a computationally
optimized form, or the form in which we implemented them, but only to describe
their computational effects.

The FF-AS-SBP algorithm takes as input the problem in the graph
representation and the density D of the problem, which is used as a lower bound. It
stores solutions as FF permutations and returns the cost of the best
permutation found. The cost, $L_\pi$, of a permutation $\pi$ is defined as the cost of the packing
obtained from performing FF with $\pi$. The parameters to the algorithm are $\alpha$,
N, K and $\gamma$. $\alpha$ and $\gamma$ are real numbers in the interval [0, 1). N and K are positive
integers.

The non-scalar variables used are B and T. B is a set of permutations, namely
the set of the best K solutions discovered. $T = (\tau_{ij})$ is an n x n matrix of real
values, each representing the intensity of a pheromone trail. $\tau_{ij}$ represents the
desirability of setting $\pi_i < \pi_j$, in the spirit of matrix R of Section 4.2 (unlike in
R, the matrix is not lower-triangular but diagonal elements are irrelevant).

FF-AS-SBP Algorithm

1. Let $\tau_0 = (nD)^{-1}$. Let $L_{min}$ denote $\min_{\pi \in B} L_\pi$.

2. $B \leftarrow \emptyset$; $T \leftarrow \tau_0$

3. Populate B with a random permutation for 2K times

4. Delete from B all but the best K permutations

5. $T \leftarrow (1 - \alpha)T$

6. For each $\pi \in B$, update the pheromone matrix T according to the Global
Update Algorithm using a reinforcement value of $L_\pi^{-1}$

7. For each ant a = 1, 2, ..., N, build a solution permutation according to
the Ant Solution Algorithm and add the solution to B while keeping
$|B| \le N$ by deleting worst permutations as necessary

8. If $L_{min} > D$ and neither the iteration limit nor time limit has been exceeded
then goto 5

9. Return $L_{min}$

Steps 1-4 perform initialization. Steps 5-6 perform global pheromone update 
using the K best solutions. Step 7 performs ant exploration and local pheromone 
update. 

We now explain the role of each input parameter. $\alpha$ represents the rate of
pheromone evaporation and determines, conversely, the persistence of memory
from earlier iterations. N is the number of ants used in step 7 for constructing
ant solutions. K is the number of globally best solutions to remember for the
purpose of the global update in step 6. $\gamma$ represents the preference for
exploitation over exploration when ants construct their solutions in the Ant Solution
Algorithm.

We now describe the sub-algorithms in detail. 

Global Update Algorithm

This simple algorithm reinforces pheromone trails associated with a given
solution $\pi$ by a given reinforcement value $\Delta$.

For each $u \in \{1, 2, \ldots, n - 1\}$
    For each $v \in \{u + 1, \ldots, n\}$
        If $\pi_u < \pi_v$
            $\tau_{uv} \leftarrow \tau_{uv} + \Delta$
        else
            $\tau_{vu} \leftarrow \tau_{vu} + \Delta$
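
The global update translates almost line by line into code. The sketch below assumes 0-based vertex indices, a pheromone matrix stored as a list of lists, and a permutation given as a list of labels; these representation choices are illustrative.

```python
def global_update(tau, pi, delta):
    """Reinforce, in place, the trails that agree with the relative order in pi."""
    n = len(pi)
    for u in range(n - 1):
        for v in range(u + 1, n):
            if pi[u] < pi[v]:
                tau[u][v] += delta   # trail for "u before v"
            else:
                tau[v][u] += delta   # trail for "v before u"


if __name__ == "__main__":
    tau = [[0.0] * 3 for _ in range(3)]
    global_update(tau, [2, 3, 1], 0.5)
    print(tau)
```

Exactly one of the two trails of each vertex pair is reinforced, so a solution with n vertices touches n(n - 1)/2 entries of the matrix.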

Ant Solution Algorithm

This is the algorithm that each ant performs in building a solution $\pi$. The
solution (permutation) is incrementally constructed by setting $\pi_c$ in order, starting
on $\pi_1$ and ending on $\pi_n$, similar to the method described in Section 4.2. The
difference is that at each step, instead of being determined by a boolean matrix
of constraints, each $\pi_c$ is set to a value probabilistically chosen from all possible
values based on their desirability.

At each step, the desirability measure of each candidate value, a function
of different pheromone strengths, is evaluated and either exploitation or
exploration is chosen probabilistically. Exploitation is chosen with probability $\gamma$. In
exploitation, the candidate value with the highest desirability is chosen. In
exploration, these values are probabilistically chosen based on their desirability.
We now define the desirability measure $d_p$ of a given value p when choosing a
value for $\pi_c$.
$d_p$ is defined as

$$d_p = \prod_{v < c} \begin{cases} \tau_{vc} & \text{if } \pi_v < p \\ \tau_{cv} & \text{otherwise} \end{cases}$$

With d computed, exploitation chooses the value with highest d, while
exploration chooses each candidate value p with probability proportionate to $d_p$. After
a value p has been chosen and $\pi$ has been updated accordingly, all pheromone
trails $\tau_{ij}$ associated with p, i.e., that appear as terms in the product in the
above equation, are diminished:

$$\tau_{ij} \leftarrow (1 - \alpha)\tau_{ij} + \alpha\tau_0$$

The above is known as the local (pheromone) update.
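
One possible reading of the ant's construction step is sketched below. It assumes 0-based vertex indices with labels starting at 1, a strictly positive pheromone matrix stored as a list of lists, a product over all previously placed vertices, and a `random.Random` instance passed in as `rng`; these are assumptions of the sketch rather than details confirmed by the paper.

```python
import random


def ant_solution(tau, tau0, alpha, gamma, rng):
    """Build one permutation; tau[i][j] is the desirability of pi_i < pi_j.

    The local pheromone update is applied to tau in place.
    """
    n = len(tau)
    pi = []  # pi[v] is the current label of vertex v
    for c in range(n):
        candidates = list(range(1, c + 2))  # vertex c may take labels 1..c+1
        desirability = []
        for p in candidates:
            d = 1.0
            for v in range(c):
                d *= tau[v][c] if pi[v] < p else tau[c][v]
            desirability.append(d)
        if rng.random() < gamma:
            # Exploitation: take the most desirable candidate.
            best = max(range(len(candidates)), key=desirability.__getitem__)
            p = candidates[best]
        else:
            # Exploration: sample proportionally to desirability.
            p = rng.choices(candidates, weights=desirability, k=1)[0]
        # Local update of the trails that appeared in the product for the chosen p.
        for v in range(c):
            i, j = (v, c) if pi[v] < p else (c, v)
            tau[i][j] = (1 - alpha) * tau[i][j] + alpha * tau0
        # Shift labels at or above p, then place vertex c.
        pi = [label + 1 if label >= p else label for label in pi]
        pi.append(p)
    return pi


if __name__ == "__main__":
    n = 5
    tau = [[0.1] * n for _ in range(n)]
    print(ant_solution(tau, 0.1, 0.1, 0.9, random.Random(1)))
```

The local update pulls the used trails back towards the initial level $\tau_0$, which makes the same choices slightly less attractive to the ants that follow within the same iteration.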

6 Experiment 

In this section we discuss the design, execution and results of our experiment to 
study the performance of the FF-AS-SBP algorithm, as well as our interpretation 
of the results and possible future work. 

For simplicity in the design of the experiments, we considered only problem 
instances with zero edge weights. Also, the constraints related to clearance dis- 
tances and fixed and forbidden positions are absent from the test cases. In other 
words, only problems that are also DSA problems are used. Nevertheless, the 
results should extend to the general case. 

While we have not mapped out the characteristics of the parameter space, as it
is too vast, we have empirically found that setting $\alpha$ = 0.1, N = K = [0.1m]
(m = |E|) and $\gamma$ = 0.9 seems to yield reasonably good solutions. This is the
parameter setting adopted in all the test runs recorded in this section.

Each of the 6 test cases consists of one connected component, i.e., there is no
way to cut the geometric representation into two parts with a vertical line
without also cutting a block. This is because a problem of more than one component
can be divided into these components to be solved individually.

The test cases were generated by a block cutting algorithm that generates 
blocks of varying height and width from an initial rectangular block, which is 
recursively divided into smaller and smaller blocks (some intermediate blocks 
are L-shaped blocks). Hence, all the problems generated are compact, i. e., have 
zero fragmentation, a property that is expected to make them difficult to solve 
optimally. 

The randomized FF algorithm and FF-AS-SBP are run 10 times on each case,
and each run is given 3|E| seconds to live. For each algorithm applied to each
of the 6 test cases, the costs from each of the 10 runs are reported in ascending
order. These results are shown in Table 1 and Table 2 and summarized in Table 3.

Table 1. Randomized First-Fit Algorithm Results

Case  D    |V|  |E|   Runs
1     80   20   119   80 80 80 80 80 80 80 80 80 80
2     120  30   338   120 120 120 121 121 121 125 125 126 126
3     160  40   457   175 186 187 187 188 189 190 190 190 193
4     200  50   633   245 246 247 248 249 251 251 252 254 254
5     240  60   783   298 300 300 301 301 305 306 307 308 312
6     320  80   1641  416 417 417 419 420 422 424 425 425 427

Table 2. FF-AS-SBP Results

Case  D    |V|  |E|   Runs
1     80   20   119   80 80 80 80 80 80 80 80 80 80
2     120  30   338   120 120 120 120 120 120 120 120 121 121
3     160  40   457   173 173 175 176 176 177 177 182 182 184
4     200  50   633   205 225 231 232 233 235 238 239 244 247
5     240  60   783   254 254 255 258 258 258 258 258 258 258
6     320  80   1641  368 369 369 369 370 375 376 379 384 387

The randomized FF algorithm acts as a control for checking whether phero- 
mone information does make a real difference in the quality of obtained solutions. 
For this reason, time-to-live of the runs is expressed in terms of actual time taken 
rather than number of iterations. 

FF-AS-SBP performed consistently and significantly better than randomized 
FF, especially for larger cases, except in case 1, where both algorithms always 
returned an optimum solution. 

Our interpretation of the experiment results is that pheromone information 
did help to improve the quality of the solution when applied to an FF-based 
approach to the SBP. We also note that it was necessary for us to adopt this 
indirect approach to avoid building an algorithm that would futilely explore the 
rough terrain of the more direct approach based on DAG labeling. 



Table 3. Experiment Summary

                      Randomized FF        FF-AS-SBP
Case  D    |V|  |E|   Min  Mean   Max      Min  Mean   Max
1     80   20   119   80   80     80       80   80     80
2     120  30   338   120  122.5  126      120  120.2  121
3     160  40   457   175  187.5  193      173  177.5  184
4     200  50   633   245  249.7  254      205  232.9  247
5     240  60   783   298  303.8  312      254  256.9  258
6     320  80   1641  416  421.2  427      368  374.6  387

In view of the current results, the following areas merit further investigation:

— Heuristic measure for (partial) FF permutations while they are being con- 
structed. An obvious choice is the cost of the partial solution. 

— Alternative pheromone matrix design. 

— The full characteristics of the parameter space in affecting the dynamics of 
the ant exploration and pheromone matrix. This could provide insight on 
how to optimize convergence and quality of solution, by setting parameters 
appropriately, or even dynamically varying them during the search through 
the use of appropriately designed meta-heuristics. 

— Employing heterogeneous ants in the search. Ants could differ by quanti- 
tative parameters or qualitatively by the search strategy they use. The di- 
versity introduced by having heterogeneous ants could contribute to overall 
algorithm performance. If investigation demonstrates that employing het- 
erogeneous ants is indeed a profitable approach, the result could extend to 
other fields of distributed computing and lead to interesting investigation on 
the use of heterogeneous agents in those fields as well. 

— How performance can be improved by increasing the number of ants. If in- 
creasing the number of ants can significantly improve performance, then 
there exists the possibility of distributing the work of many ants to multi- 
ple processors to achieve significant speedup, since individual ants operate 
independently of one another. 

— How local search techniques can be combined with the basic ACO technique
to improve performance. An example of such a technique is the famous tabu 
search method [9]. 

7 Conclusions 

In this paper, we have formulated a pheromone information design and coupled
it with the FF algorithm to produce an ACO algorithm that obtains reasonably
good solutions for the SBP, thus demonstrating that the ACO paradigm can be
applied effectively to the SBP. Moreover, we have demonstrated that pheromone
structure does not need to be directly related to the physical structure with
which a target problem is naturally expressed.

The major ingredients of our algorithm, FF-AS-SBP, are the FF algorithm, 
permutation construction and the pheromone information that supports this 
construction. FF-AS-SBP compares favorably against the randomized FF algo- 
rithm. 

We have also proposed a few key areas for further investigation to obtain 
even better performance and deeper insight. Of special interest is the possibility 
of using heterogeneous agents in FF-AS-SBP and in distributed computing in 
general.

There is no well-known model for mobile agent security. One of the few attempts so
far is given by [1]. The model is, however, a qualitative model that does not have
direct numerical measures. It would be valuable to have a quantitative model that can
give users an intuitive sense of "how secure an agent is".

Software reliability modeling is a successful attempt to give quantitative measures of 
software systems. In the broadest sense, security is one of the aspects of reliability. A 
system is likely to be more reliable if it is more secure. One of the pioneering efforts 
to integrate security and reliability is [2]. In that work, the following similarities
between security and reliability were observed.



Security                        Reliability
Vulnerabilities                 Faults
Breach                          Failure
Fail upon attack effort spent   Fail upon usage time elapsed

Fig. 1. Analogy between Reliability and Security



Thus, we have the security function, the effort-to-next-breach distribution, and the
security hazard rate, analogous respectively to the reliability function, the
time-to-next-failure distribution, and the reliability hazard rate of reliability theory.
One of the works that fits system security into a mathematical model is [3], which
presents an experiment to model attacker behavior. The results show that during the
"standard attack phase", assuming breaches are independent and stochastically identical,
the period of working time of a single attacker between successive breaches is found to
be exponentially distributed.




Fig. 2. A Mobile Agent Travelling on a Network 



Now, let us consider a mobile agent travelling through n hosts on the network, as
illustrated in Figure 2. Each host, and the agent itself, is modeled as an abstract
machine as in [1]. We consider only the standard attack phase described in [3] by
malicious hosts. On arrival at a malicious host, the mobile agent is subject to an attack
effort from the host. Because the host is modeled as a machine, it is reasonable to
estimate the attack effort by the number of instructions needed to carry out the attack,
which would be linearly increasing with time. On arrival at a non-malicious host, the
effort would be constant zero. Let the agent arrive at host i at time $T_i$, for i = 1, 2, ...,
n. Then the effort of host i at total time t would be described by the time-to-effort
function:

$E_i(t) = k_i(t - T_i)$, where $k_i$ is a constant

We may call the constant k the coefficient of malice. The larger the the more 
malicious host z is (C. = 0 if host z is non-malicious). Furthermore, let the agent stay on 
host z for an amount of time then there would be breach to the agent if and only if 
the following breach condition holds: 

EftfT) > effort to next breach by host z 
i.e., kf > effort to next breach by host z 

As seen from [3], it is reasonable to assume an exponential distribution of the effort 
to next breach, so the probability of breach at host i is 

    P(breach at host i) = P(breach at time $t_i + T_i$) 
                        = P(breach at effort $k_i t_i$) 
                        = $1 - \exp(-v k_i t_i)$, where v is a constant 
                        = $1 - \exp(-\lambda_i t_i)$, where $\lambda_i = v k_i$. 

We may call v the coefficient of vulnerability of the agent. The higher v is, the higher 
the probability of breach to the agent. Therefore, the agent security E is the 
probability of no breach at any host, i.e., 

    $E = \prod_{i=1}^{n} e^{-\lambda_i t_i} = e^{-\sum_{i=1}^{n} \lambda_i t_i}$ 

Suppose that we can estimate the coefficients of malice $k_i$ of the hosts based on trust 
records of hosts, and also estimate the coefficient of vulnerability v of the agent based 
on testing and experiments. Then we can calculate the time limits $t_i$ needed to 
achieve a certain level of security E. Conversely, if users specify that some task must 
be carried out on a particular host for a fixed period of time, we can calculate the 
agent security E for the users based on the estimated coefficients of malice and 
vulnerability. 
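
The two calculations described above can be sketched in a few lines of Python. This is 
only an illustration under stated assumptions: the function names are hypothetical, the 
coefficients are taken as already estimated, and the time limit is computed for the 
simplified case in which the agent stays equally long on every host. 

```python
import math


def agent_security(v, malice, stay_times):
    """Agent security E = exp(-sum_i v * k_i * t_i).

    v          -- coefficient of vulnerability of the agent
    malice     -- coefficients of malice k_i, one per host (0 for a non-malicious host)
    stay_times -- time t_i the agent stays on each host
    """
    if len(malice) != len(stay_times):
        raise ValueError("need one stay time per host")
    return math.exp(-sum(v * k * t for k, t in zip(malice, stay_times)))


def uniform_time_limit(v, malice, target_security):
    """Largest common stay time t (assumed equal on every host) with E >= target."""
    total_rate = v * sum(malice)  # sum of lambda_i = v * k_i
    if total_rate == 0:
        return math.inf  # no malicious host: any stay time is safe
    return -math.log(target_security) / total_rate
```

For example, `agent_security(0.5, [0, 2, 1], [10, 1, 2])` evaluates the product formula 
for three hosts of which the first is non-malicious. 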

1 Introduction 

Agent-based technologies for distributed computing are becoming increasingly 
relevant in academic and industrial research and development. Our multi-agent 
system CASA focuses on the specification of complex plans that describe the behavior 
of agents. The design of CASA is based on concepts from concurrent logic 
programming that were used to extend the well-known BDI agent approach 
AgentSpeak(L) [2]. Rao’s AgentSpeak(L) is a specification language similar to Horn 
clauses and can be viewed as an abstraction of an implemented BDI system. Our work 
is based on the assumption that AgentSpeak(L) demonstrates a successful 
reengineering of an implemented multi-agent system that is now given a formal 
specification. By extending this specification with new features for complex plans, it 
was possible to derive an efficient implementation that supports these new features. 

2 Design of the CASA Specification Language 

Specification: CASA agents perceive events/messages and select relevant plans with 
individual weights for handling these events/messages. Plans are modeled as clauses 
with additional guard predicates for testing the applicability of the clause (applicable 
plans). Such guard predicates may be arbitrarily complex, i.e., the reduction of a guard 
may include the evaluation of another plan (deep guards). Such complex speculative 
computations may also include communicative acts with other agents. Based on the 
contextual conditions in guard predicates, different plan types can be distinguished: 
reactive plans have only simple tests as guards, deliberative plans allow speculative 
computations within an agent, and communicative plans allow communication with 
other agents. CASA uses a hierarchy for plan selection (reactive > deliberative > 
communicative). An agent can execute several plans at a time, and elements of a single 
plan can be processed sequentially or in parallel. Additionally, plans can be suspended 
by other plans and have a special exception section that is used if an applicable 
executed plan fails. The features of a CASA agent are formally defined based on 
extended guarded Horn clauses, and the cycle of operation is specified by an abstract 
interpreter. For ease of use we developed a simple textual format that allows efficient 
modeling. 
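
The plan selection hierarchy can be pictured with a small Python sketch. It is not the 
CASA interpreter: the class and function names are hypothetical, guards are reduced to 
plain Boolean functions over a belief dictionary (so deep guards and communication are 
not modeled), and breaking ties within one plan type by the larger weight is an 
assumption made for illustration. 

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Lower number = preferred, mirroring reactive > deliberative > communicative.
PRIORITY = {"reactive": 0, "deliberative": 1, "communicative": 2}


@dataclass
class Plan:
    name: str
    trigger: str                      # event/message the plan is relevant for
    kind: str                         # "reactive", "deliberative" or "communicative"
    weight: float                     # individual weight of the plan
    guard: Callable[[dict], bool]     # applicability test over the beliefs


def select_plan(event, beliefs, plans) -> Optional[Plan]:
    """Pick one applicable plan: best plan type first, then highest weight."""
    applicable = [p for p in plans if p.trigger == event and p.guard(beliefs)]
    if not applicable:
        return None
    return min(applicable, key=lambda p: (PRIORITY[p.kind], -p.weight))
```

With this ordering a reactive plan whose guard holds is always chosen before a 
deliberative or communicative one, regardless of the weights. 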

Modeling: The textual CASA definition format defines the initial agent state and is 
divided into four sections. First, functions for selecting events, plans, and intentions 
at run time have to be declared. Initial goals of the agent are declared in the 
second section. Each of these goals will be instantiated together with an applicable plan. 
For parallel plan execution, a multistack data structure is used to handle the 
instantiated plans (called intentions) as separate entities. Initial facts and plans are 
listed in the third and fourth sections. 




Handling speculative computations is one of the major aspects of the abstract CASA 
interpreter. Speculative computations appear in the context test for relevant plans 
whenever guards of deliberative or communicative plans have to be checked. Two 
independent additional components are introduced in order to manage speculative 
computations: the DelibStructure (resp. CommStructure) holds elements composed of a 
goal and all relevant deliberative (resp. communicative) plans to check. For each of 
these elements, a new instance of the interpreter is generated and executed in parallel 
with the other cycles. The operational semantics of the CASA interpreter is best 
described by means of the interpreter cycle shown in Figure 1. 

Implementation: CASA is implemented with JDK 1.1.8, integrating modules of M. 
Huber’s JAM library. A parser written in JavaCC reads the textual CASA specification, 
sets the initial state of the CASA agents, and starts the execution on the CASA 
interpreter. CASA agents are integrated into the MECCA framework, an agent 
management system that implements the FIPA ACL standard. 

Validation: As a first case study we presented a simple application taken from holonic 
manufacturing [1]. Future work will concentrate on the development of visual tools for 
the design of CASA agents and on applications in the areas of flexible manufacturing 
systems and intelligent user interfaces. 

1 Introduction 

In PCS (personal communication systems) networks, location management deals 
with the problem of tracking down a mobile user. It involves two kinds of 
activities: location updating and paging [1]. The costs of location updating and 
paging are closely coupled and therefore trade off against each other in the total 
signaling cost. Location management strategies, as implemented in current PCS 
networks and as presented in the literature, are based on spatially static location 
areas that are built by heuristics or aggregate statistics [1,2]. The same static 
location areas are used for all mobile users, even though the mobility pattern and 
call arrival rate of each mobile user differ spatially and temporally. Thus, location 
management that applies the same location areas to all mobile users cannot adapt 
to the differing mobility properties of individual users. Consequently, these 
strategies cannot efficiently reduce the total signaling cost. 

In this paper, we propose a location area partitioning strategy that partitions 
the set of cells into optimal location areas. Our strategy considers the per-user 
mobility pattern and call arrival rate in a zone-based approach. In addition, the 
paging and location updating costs are considered simultaneously when determining 
the location areas, because both are closely correlated in the total signaling cost. 

2 Proposed Strategy 

The proposed strategy consists of the following three steps: 

1) Collect individual mobile user’s mobility pattern and call arrival 
rate. Update user profile every day or every week: We attempt to determine 
the location areas on a per-user basis (personal location areas). The information 
needed to create the personal location areas is derived from the user profile P, 
which contains a record of previous transitions from cell to cell and the number of 
calls requested in a given time interval. P is updated every day or week, using the 
information accumulated whenever the user responds to a paging message, 
originates a call, or sends an update message. 

2) Create the mobility graph from user profile: In order to represent and 
maintain the user profile efficiently, we introduce the mobility graph G = (V, E), 
where V is the set of vertices and E is the set of edges. In G, each vertex has 
a weight that represents the average call arrival rate; this weight is used as the 
paging cost. The edge between two vertices has a weight that represents the 
average updating rate when a user moves across the two cells; this weight is used 
as the location updating cost. The total signaling cost per mobile user based on the 
mobility graph is: 

    $C_T(m, \mu_a, \mu_m) = m \cdot \mu_a \cdot C_p + \mu_m \cdot C_u$ 

where m is the number of cells in a location area, $\mu_a$ is the call arrival rate, 
and $\mu_m$ is the location updating rate. $C_p$ and $C_u$ are the paging cost per 
cell and the location updating cost, respectively. 
3) Partition the vertices of the mobility graph into optimal location 
areas by genetic algorithms: We treat the determination of location areas 
as a partitioning problem: a partition of $N_V$ vertices into $N_{LA}$ location 
areas. Since the location area partitioning problem is a well-known NP-complete 
problem, an exact search for optimal location areas within a given time is 
impractical. Genetic algorithms are therefore used to find near-optimal location 
areas. In the mobility graph, an active vertex represents a cell which the user 
visited at least once or in which the user received more than one incoming call. 
Only active vertices are encoded into individuals. Structural crossover and 
structural mutation are used as genetic operators. 
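
A fitness function for such a partition could look like the following Python sketch. 
It rests on assumptions that go beyond the text: the per-area cost $m \cdot \mu_a \cdot C_p$ 
is summed over all location areas with $\mu_a$ taken as the sum of the vertex weights in 
the area, and the updating cost is charged for every edge whose endpoints lie in 
different areas. The function and parameter names are hypothetical, and the genetic 
operators themselves are not shown. 

```python
def signaling_cost(partition, call_rate, update_rate, c_p, c_u):
    """Total signaling cost of one candidate partition of the mobility graph.

    partition   -- dict: cell -> location area id
    call_rate   -- dict: cell -> average call arrival rate (vertex weight)
    update_rate -- dict: (cell_u, cell_v) -> average updating rate (edge weight)
    c_p, c_u    -- paging cost per cell and location updating cost
    """
    areas = {}
    for cell, area in partition.items():
        areas.setdefault(area, []).append(cell)

    # Paging: every call to a cell of an area pages all m cells of that area.
    paging = 0.0
    for cells in areas.values():
        m = len(cells)
        mu_a = sum(call_rate[c] for c in cells)
        paging += m * mu_a * c_p

    # Updating: only movements that cross a location area boundary cost an update.
    mu_m = sum(rate for (u, v), rate in update_rate.items()
               if partition[u] != partition[v])
    return paging + mu_m * c_u
```

A genetic algorithm would minimize this value over candidate partitions; larger 
location areas raise the paging term and lower the updating term, which is the 
trade-off described in the introduction. 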

In our simulation, we use mobility data collected for ten days from a mobile 
user in a hexagonal cell environment. We assume that the mobile user moves 
through six intervals, in which different mobility patterns (velocity and direction) 
and call arrival rates are used. After 1000 generations, the number of location 
areas produced is 7, 5, 4, and 3 when $1/\gamma$ (the ratio of $C_u$ to $C_p$) is 
2, 3, 5, and 10, respectively. This shows that our strategy can optimize the 
number of location areas as well as the location area size according to the 
mobility pattern and call arrival rate of a mobile user. Also, our strategy has a 
smaller cost than the zone-based strategy with fixed location areas, regardless of 
the value of $1/\gamma$. 

3 Conclusions 

In this paper, we proposed a location area partitioning strategy using genetic 
algorithms for mobile location tracking, based on the observation that the size and 
shape of location areas affect the total signaling cost. 

Most functional languages such as ML and Haskell, and many object-oriented 
languages such as Smalltalk and Java, use garbage collection to reclaim memory 
automatically. Since programmers are freed from worrying about memory 
management, they can enjoy more safety and productivity than in conventional 
languages such as C. On the other hand, garbage collection causes several 
difficulties in real-time systems and very small systems, since it is difficult to 
predict the execution time of particular parts of programs and since garbage 
collection requires a certain amount of memory in order to be performed efficiently. 

Tofte and Talpin’s region and effect system for ML [1] is an alternative to 
run-time garbage collection. It infers the lifetimes of objects at compile time and 
makes it possible to reclaim memory safely without using a garbage collector at run 
time. Objects can then be allocated in a stack (to be exact, in a stack of regions). 
The system takes an ordinary ML program as input and annotates it with region 
operators such as letregion and at. Programmers do not have to provide any 
explicit directives about the lifetimes of objects. It is sometimes necessary, however, 
for programmers to transform programs specially in order to make the system 
infer what they intend. Therefore, programmers have to know, to some extent, the 
region inference rules. 
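
The stack-of-regions discipline can be mimicked at run time with a short Python 
sketch. This is only an analogy under stated assumptions: in Tofte and Talpin’s system 
the regions are determined statically by inference, whereas here a hypothetical 
RegionStack class manages them dynamically, and a region is simply a Python list. 

```python
from contextlib import contextmanager


class RegionStack:
    """Run-time imitation of a stack of regions (hypothetical helper class)."""

    def __init__(self):
        self._regions = []  # the innermost region is the last element

    @contextmanager
    def letregion(self):
        """Open a new region; everything allocated in it dies when the block ends."""
        region = []
        self._regions.append(region)
        try:
            yield region
        finally:
            self._regions.pop()
            region.clear()  # release all objects of the region at once

    def alloc(self, value, region=None):
        """Allocate a value 'at' the given region (default: the innermost one)."""
        target = self._regions[-1] if region is None else region
        target.append(value)
        return value

    def depth(self):
        return len(self._regions)
```

Inside nested `with rs.letregion() as r:` blocks, `rs.alloc(x)` places x in the newest 
region, while `rs.alloc(x, region=outer)` places it in an older region that outlives 
the inner block. 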

However, rather than a system which infers all about regions automatically, 
sometimes it might be convenient to have a semi-automatic system which lets 
programmers delimit lifetime of objects directly and checks safety of such di- 
rectives. This is what we want to propose here. In principle, such a direct style 
would be possible in Tofte and Talpin’s system by allowing programmers to use 
region annotations explicitly. In practice, however, it would be difficult to do so 
since region annotations tend to be lengthy and complex. Therefore, we want to 
make region annotations simple and readable. 

Our proposal is also based on Launchbury and Peyton Jones’s proposal of 
runST [2] for Haskell, which encapsulates stateful computations in pure 
computations. “State” in Launchbury and Peyton Jones’s terminology roughly 
corresponds to “region” in Tofte and Talpin’s. Though Haskell is a lazy language and 
therefore memory management is not a target of their research, it seems possible 
to apply their proposal to memory management in eager languages such 
as ML. In fact, Launchbury and Peyton Jones’s runST and Tofte and Talpin’s 
letregion seem to be closely related: they both use polymorphism to delimit 
regions and effects. In Launchbury and Peyton Jones’s proposal, state is manipulated 
through the monad of state transformers and so is not handled explicitly 
in programs, while types carry information about state and the extent of state 
is given explicitly by the operator runST. 

In our system, in most cases, programmers do not have to specify regions; 
the current default region, which is passed around implicitly, is then used. 
Sometimes, however, we want to access older regions instead of the current region. 
In such cases, programmers should be able to specify the region explicitly. 
Compositional references, which the author proposed in [3] in order to manipulate 
general mutable data structures in Haskell, are used for this purpose. Our region 
actually corresponds to a set of regions in Tofte and Talpin’s system. A newer region 
contains older regions, and there is a kind of subtyping relation between old 
and new regions. This fact helps to keep type expressions simple. When our 
region primitive, named extendST, allocates a new region, it passes to the inner 
expression a reference to the old default region. This is in contrast to letregion, 
which passes to the expression a reference to the newly created region. 

One of the important points of Tofte and Talpin’s system is region inference 
for recursive functions [4]. Recursive functions must be region polymorphic in 
order for the system to be practical. In general, however, ML-style type inference 
algorithms cannot infer the most general type for polymorphic recursion. Tofte 
and Talpin’s system treats region parameters in a special way to deal with 
polymorphic recursion and uses iteration to find an appropriate region assignment. 
We will present an alternative way of inferring regions of recursive functions, 
which does not use iteration. 

Acknowledgment 

This research is partially supported by the Japan Society for the Promotion of 
Science, Grant-in-Aid for Encouragement of Young Scientists, 10780196, 1999. 



References 

1. Mads Tofte and Jean-Pierre Talpin. Implementation of the Typed Call-by-Value 
λ-calculus using a Stack of Regions. In 21st ACM Symposium on Principles of 
Programming Languages, pages 188-201, January 1994. 

2. John Launchbury and Simon L. Peyton Jones. State in Haskell. Lisp and Symbolic 
Computation, 8(4):293-341, 1995. 

3. Koji Kagawa. Compositional References for Stateful Functional Programming. In Proc. 
of the International Conference on Functional Programming 1997, June 1997. 

4. Mads Tofte and Lars Birkedal. A Region Inference Algorithm. ACM Transactions 
on Programming Languages and Systems, 20(5):724-767, July 1998. 




A Verification Technique Based on Syntactic 
Action Refinement in a TCSP-like Process 
Algebra and the Hennessy-Milner-Logic 



Mila Majster-Cederbaum and Frank Salger 

Universität Mannheim 
Fakultät für Mathematik und Informatik, 

D7, 27, 68131 Mannheim, Germany 
{mcb, fsalger}@pi2.informatik.uni-mannheim.de 



Abstract. We investigate the conceptual link between action refinement 
for a process algebra which contains the TCSP-parallel composition 
operator and recursion on the one hand and action refinement 
for the Hennessy-Milner-Logic on the other. We show that the assertion 
$P \models \varphi \Leftrightarrow P[a \leadsto Q] \models \varphi[a \leadsto Q]$, 
where $\cdot[a \leadsto Q]$ denotes the refinement operator both on process 
terms and formulas, holds in the considered framework under weak and 
reasonable restrictions. 



TCSP (see e.g. [BHR84]) can be used to develop reactive systems in a formal 
way. This process calculus provides operators which are used to compose process 
expressions into more complex systems. The basic building blocks in such calculi 
are uninterpreted atomic actions, which can be seen as the conceptual entities 
at a chosen level of abstraction. The concept of (syntactic) action refinement 
makes it possible to provide a more concrete structure for an action in 
a later system design step: given a process expression P, one refines an atomic 
action a of P by a more complex process expression Q, obtaining a more detailed 
process expression $P[a \leadsto Q]$. The Hennessy-Milner-Logic HML [Mil80] 
provides the ability to formalize properties of processes. 
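
To make the satisfaction relation concrete, the following Python sketch checks HML 
formulas built from tt, negation, conjunction and the diamond modality on a finite 
labelled transition system. The encoding of formulas as tuples and of transition 
systems as dictionaries is an assumption made for illustration, and the refined system 
and refined formula in the example are written by hand for the simple case 
Q = a1.a2; they are not produced by the refinement operators studied in this paper. 

```python
def sat(lts, state, formula):
    """Does `state` of the transition system `lts` satisfy the HML `formula`?

    lts     -- dict: state -> list of (action, successor state)
    formula -- ("tt",), ("not", f), ("and", f, g) or ("dia", action, f)
    """
    op = formula[0]
    if op == "tt":
        return True
    if op == "not":
        return not sat(lts, state, formula[1])
    if op == "and":
        return sat(lts, state, formula[1]) and sat(lts, state, formula[2])
    if op == "dia":  # <action> f: some action-successor satisfies f
        _, action, sub = formula
        return any(sat(lts, nxt, sub)
                   for act, nxt in lts.get(state, []) if act == action)
    raise ValueError(f"unknown operator: {op!r}")


# P = a.b.0 and a hand-built transition system for P refined by Q = a1.a2.
P = {"p0": [("a", "p1")], "p1": [("b", "p2")], "p2": []}
P_REF = {"q0": [("a1", "q1")], "q1": [("a2", "q2")], "q2": [("b", "q3")], "q3": []}

PHI = ("dia", "a", ("dia", "b", ("tt",)))
PHI_REF = ("dia", "a1", ("dia", "a2", ("dia", "b", ("tt",))))  # hand-refined PHI
```

In this small example the process and its refinement agree on the formula and its 
hand-refined counterpart, and they also agree on a formula that P does not satisfy, 
such as `("dia", "b", ("tt",))`, which contains no occurrence of a. 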

Whereas the interplay between action refinement and various notions of 
bisimulation is well understood (see e.g. [AH91, GGR94]), little attention has 
been given so far to the interplay of action refinement and specification logics. 
In particular, knowing $P \models \varphi$ does not provide information about which 
formulas are satisfied by $P[a \leadsto Q]$, or vice versa. Such knowledge, however, 
could be used to facilitate the task of verification in a stepwise refinement of 
systems. We enrich HML by an operator for action refinement and provide the link 
between syntactic action refinement for TCSP (without hiding and internal 
nondeterminism) and action refinement in HML by showing that the assertion 

    $P \models \varphi \Leftrightarrow P[a \leadsto Q] \models \varphi[a \leadsto Q]$    (1) 

holds under reasonable and weak restrictions. 

Assertion (1) can be interpreted in various ways. Firstly, we may interpret 
(1) as a simplification of the verification task, as we may check $P \models \varphi$ 
instead of $P[a \leadsto Q] \models \varphi[a \leadsto Q]$. Secondly, we may read (1) 
from left to right, which embodies two views of verification. On the one hand, we 
can focus on the refinement of the specification $\varphi$: given $P \models \varphi$, 
a refinement $\cdot[a \leadsto Q]$ on $\varphi$ automatically supplies a refined 
process term which satisfies the specification $\varphi[a \leadsto Q]$. This 
can be seen as a concept of ‘a-priori’-verification. On the other hand, we can focus 
on the refinement of process terms, i.e. we consider the refinement 
$P[a \leadsto Q]$ of P, where $P \models \varphi$, and obtain automatically 
$P[a \leadsto Q] \models \varphi[a \leadsto Q]$. One might argue that we could 
determine $P[a \leadsto Q]$, search for a specification $\psi$ which reflects 
the refinement $\cdot[a \leadsto Q]$ in the logic, and then apply a model checker 
to establish $P[a \leadsto Q] \models \psi$. However, no hint is available as to which 
formula $\psi$ reflects the refinement of P in an appropriate way. Hence we would 
have to test a sequence of formulas $\psi_1, \psi_2, \ldots$ (of which we think that 
they reflect the refinement) with the model checker, where in general each 
application of the model checker might need exponential time. The formula 
$\mathcal{R}ed(\varphi[a \leadsto Q])$, which is obtained from 
$\varphi[a \leadsto Q]$ by removing all occurrences of the refinement operator via 
reduction, may be of size exponential in the size of Q. However, only one such 
exponential reduction, invoked by the application of assertion (1), is necessary to 
obtain a formula which reflects the refinement appropriately. Hence (1) can be used 
as a more efficient indicator of the logical consequences induced by refinements on 
process terms than model checking. 

The validity of (1) has another interesting implication: the assertion can be used by 
a system designer who is not interested in full equality of refined process 
expressions modulo bisimulation equivalence, but in the fact that two refined 
processes both satisfy a refined property (or a set of refined properties) of special 
interest. A preliminary (full) version of this extended abstract can be found in 
[MCS99a]. Work is in progress ([MCS99b]) which extends our approach to the modal 
mu-calculus of [Koz83]. 

1 Introduction 

Many cryptographic protocols have been formally verified by various 
researchers, but no attempt has yet been made to formally verify a secret sharing 
(SS) protocol. We show, with an example of SS modeling, how an SS protocol can 
be formally verified using Coq, a general-purpose theorem prover. In modeling our 
SS protocol we follow the approach of Dominique. The approach is based on the 
use of state-based general-purpose formal methods and on a clear separation 
between the modeling of reliable agents and that of intruders. The formalization 
of the intruder knowledge, the axioms for manipulating it, and the 
protocol description can be transposed quite directly into Coq. 

Secret sharing protocol The secret share distribution protocol can be 
described in the following steps: 

i) Share preparation: The Dealer prepares the shares of the given secret and also 
prepares auxiliary shares, i.e. $\alpha_i = f(i)$ and $p_i = r(i)$, where f and r are 
two random polynomials and $f(0) = s$ is the secret; 
ii) Distribution of shares: The Dealer distributes each share to the Players via a 
private channel, i.e. secretly (DistS); 
iii) Broadcast of commitment: The Dealer broadcasts a commitment given by a 
one-way hash of all shares and auxiliary shares (Bcast); 
iv) Broadcast of verification: Each Player compares his share with the commitment 
and broadcasts the result (i.e. either true or false) (Comt); 
v) If any verification result is false: The Dealer broadcasts the secret 
share, auxiliary share and hash value of that Player (BcastD). 
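
Steps i) to iv) can be illustrated by the following Python sketch. It is not the Coq 
model of this paper. It assumes a Shamir-style scheme in which f and r are random 
polynomials of degree t - 1 over a prime field, uses SHA-256 as the one-way hash, and 
commits to each pair of share and auxiliary share separately; the modulus and all 
function names are hypothetical choices made for illustration. 

```python
import hashlib
import secrets

PRIME = 2**127 - 1  # assumed field modulus (a Mersenne prime)


def random_poly(constant, degree):
    """Random polynomial with the given constant term, as a coefficient list."""
    return [constant % PRIME] + [secrets.randbelow(PRIME) for _ in range(degree)]


def eval_poly(coeffs, x):
    """Evaluate the polynomial at x modulo PRIME (Horner's rule)."""
    result = 0
    for c in reversed(coeffs):
        result = (result * x + c) % PRIME
    return result


def commit(alpha, p):
    """One-way hash of a share and its auxiliary share."""
    return hashlib.sha256(f"{alpha}:{p}".encode()).hexdigest()


def deal(secret, n, t):
    """Steps i) and iii): prepare shares alpha_i = f(i), p_i = r(i) and commitments."""
    f = random_poly(secret, t - 1)                    # f(0) = secret
    r = random_poly(secrets.randbelow(PRIME), t - 1)  # auxiliary polynomial
    shares = {i: (eval_poly(f, i), eval_poly(r, i)) for i in range(1, n + 1)}
    commitments = {i: commit(a, p) for i, (a, p) in shares.items()}
    return shares, commitments


def verify(i, share, commitments):
    """Step iv): player i compares his share with the broadcast commitment."""
    return commit(*share) == commitments[i]


def reconstruct(points):
    """Recover f(0) from {i: alpha_i} by Lagrange interpolation at x = 0."""
    secret = 0
    for i, y in points.items():
        num, den = 1, 1
        for j in points:
            if j != i:
                num = (num * -j) % PRIME
                den = (den * (i - j)) % PRIME
        secret = (secret + y * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

With `shares, commitments = deal(secret, n, t)`, any t of the values alpha_i passed to 
`reconstruct` yield the secret again under these assumptions, and a Player holding a 
modified share obtains `False` from `verify`. 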

Modeling of intruder knowledge The data that can be communicated 
between principals are basic data, data obtained by composition, and data obtained 
by combining a valid number of shares. 

The domain of secret shares is denoted by SH. If the domain of symmetric keys is 
denoted by KS and that of asymmetric keys by KA, then K is the union 
of KA and KS. The pair operator is used to represent reversible constructors. 
Thus the domain C of data components is defined inductively starting from the 
infinite and disjoint sets KA, KS, D and SH: 

    $C = C_K \mid (C, C) \mid B, \qquad B = K \mid D \mid SH, \qquad K = KA \mid KS \mid K^{-1}$ 

For secret sharing, a new pair of operations $a$ and $a'$ is also introduced, which 
represent sharing a secret among the participants and reconstructing the secret 
from the participants' shares: 

    $a = \{(s, s') \mid \exists s'', c, t, n, c_1, c_2, \ldots, c_n \cdot s = c \cup s'' \wedge s' = s \cup c_1 \cup c_2 \cup \ldots \cup c_n \wedge t < n\}$ 

    $a' = \{(s, s') \mid \exists s'', c, n, c_1, c_2, \ldots, c_n \cdot s = c_1 \cup c_2 \cup \ldots \cup c_n \cup s'' \wedge s' = s \cup c \wedge t < n\}$ 

where n is the number of shares, t is the threshold of the sharing scheme, c is the 
shared secret, and $c_1, \ldots, c_n$ are the shares. 

Knowledge manipulation For secret sharing, in addition to the rules 
introduced by Dominique [1], some new rules are introduced. Some extensions 
of the consistency and completeness rules are: 

    $card(c_{sr}) \geq t \wedge c_{sr}\ \mathit{known\_in}\ s \Rightarrow c\ \mathit{known\_in}\ s$ 

    $card(c_{sr}) < t \wedge c_{sr}\ \mathit{known\_in}\ s \Rightarrow \neg(c\ \mathit{known\_in}\ s)$ 

    $card(c_{sr}) \geq 1 \wedge c_{sr}\ \mathit{comp\_of}\ s \Rightarrow c\ \mathit{comp\_of}\ s$ 

Formalization of the protocol In the state model of the Dealer, $\alpha$ and $p$ are 
the secret shares and support shares for the Players. The protocol is then modeled 
using 8 basic actions, such as 

action{sd-Cit^ Id, A s^ = siA A new{s^.ai^ Si) 

action{sp..at^ IPi^Sp.at) A {spC\) known An sj 

action{sd-at^ 2d, Sp. .at) A = SiU {spHash{ai, pi)) A new{s pHash{ai, pi)) 
where primed variables represent the values of state variables after 
the application of the operation and where, by convention, state variables that 
are not explicitly changed are supposed to be unchanged: the first action, which 
only mentions new values for $s_d.at$ and $s_i$, thus assumes by default that the 
remaining state variables keep their values, and so on. The predicate action is used 
to specify sequencing constraints, i.e. the control structure for each role. Here we 
assume that D is repeating an infinite loop [1d, 2d, 4d] and that 
$At_{D1}, \ldots, At_{D4}$ are, for example, the respective control points before 
each of the three actions. In the same way $P_i$ is supposed to repeat 
[$1p_i$, $2p_i$, $3p_i$, $4p_i$]. The predicate action(at, act, at') can in this case 
be defined as the union of 8 relations 
(at = $At_{D1}$ ∧ act = 1d ∧ at' = $At_{D2}$) ∨ ... ∨ 
(at = $At_{D4}$ ∧ act = 4d ∧ at' = $At_{D1}$). 



2 Conclusions 

The use of general theorem proving verification methods to help the verification 
of cryptographic protocols looks promising. If a protocol has been analyzed using 
general theorem proving verification methods, one should however not be fooled 
into thinking that the protocol is one hundred percent correct. The verification 
model should be seen as a debugging tool, which explores potential attacks on 
the protocol in an automated manner. 

A system of linear constraints is represented as a set of half-planes 
$S = \{(a_j X + b_j Y + c_j \le 0),\ j = 1 \ldots N\}$. Therefore, in the context of 
linear constraints, the two terms “set” and “system” can be used interchangeably. 
A set of linear constraints represents a convex polygon that is the 
intersection of all half-planes in the set. The convex polygon represented by S is 
called the feasible polygon of S. Such sets of linear constraints can be used as 
a new way of representing spatial data. These sets need to be manipulated 
efficiently and stored using minimal storage. It is natural to store only sets of linear 
constraints which are feasible and in irredundant form. Therefore, it is very 
important to find out whether a given system is feasible and/or bounded and to find 
the minimal (irredundant) set of linear constraints which has the same feasible 
area as the given one. 

LASSEZ and MAHER (1988) have investigated algorithms to check whether a system 
of linear constraints over multidimensional $R^d$ is feasible. LASSEZ et al. (1989) 
have investigated algorithms to eliminate redundant constraints from a system of 
linear constraints over $R^d$. Their algorithms are based on Fourier variable 
elimination (similar to Gaussian elimination for solving systems of linear 
equations) and therefore have a running time that is superlinear in N, the number 
of constraints, and as such are not efficient. DYER (1984) and MEGIDDO (1983) have 
independently proposed linear time algorithms to solve the linear programming 
problem in the 2- and 3-dimensional cases. In this paper, we consider sets of linear 
constraints as a new way of representing geographic data and therefore consider 
only sets of linear constraints over $R^2$, which is enough for our purpose. DYER’s 
and MEGIDDO’s algorithms need to be investigated further for our purpose. 

Feasibility of a set of linear constraints is in fact one output of the algorithms by 
DYER (1984) and MEGIDDO (1983). We mention the algorithm for checking feasibility 
of sets of linear constraints here for completeness. We propose here a linear time 
algorithm to check boundedness of sets of linear constraints. Boundedness of a set 
of linear constraints S can be determined by applying the algorithms by DYER (1984) 
and MEGIDDO (1983) to find four extremity points of the convex feasible polygon P 
of S: 

(i) $(X_0, Y_{min})$ : $Y_{min} = \min\{Y : \exists X_0, (X_0, Y) \in P\}$, 
(ii) $(X_1, Y_{max})$ : $Y_{max} = \max\{Y : \exists X_1, (X_1, Y) \in P\}$, 
(iii) $(X_{min}, Y_0)$ : $X_{min} = \min\{X : \exists Y_0, (X, Y_0) \in P\}$, and 
(iv) $(X_{max}, Y_1)$ : $X_{max} = \max\{X : \exists Y_1, (X, Y_1) \in P\}$. 

If S is infeasible, or if S is feasible and all of $X_{min}$, $X_{max}$, $Y_{min}$ and 
$Y_{max}$ are finite, then S is bounded. Otherwise S is unbounded. 

The algorithms by DYER (1984) and MEGIDDO (1983) have been proved to 
run in O(N) time. Therefore, the algorithms to check the feasibility 
and boundedness of a set of linear constraints over $R^2$ run in O(N) time, 
where N is the number of linear constraints in S. 

In this paper, we also propose a linear time algorithm to eliminate the 
redundancy of a set of linear constraints, provided the set of linear constraints has 
been pre-sorted. A set of linear constraints S is said to be redundant if there exists a 
proper subset $S' \subset S$ ($S' \subseteq S$ and $S' \ne S$) such that the feasible 
polygon of S' and the feasible polygon of S are exactly the same. Let S be a set of 
linear constraints over $R^2$ with feasible polygon P. Using the convexity of the 
polygon P, our algorithm for eliminating the redundancy of S is based on the 
following observations: 

• Suppose $l_1$ and $l_2$ are two edges of a convex polygon P with slopes 
$\alpha_1, \alpha_2$, respectively, satisfying $\alpha_1 < \alpha_2$. Given a line l 
with slope $\alpha$ satisfying $\alpha_1 < \alpha < \alpha_2$, saying that l is an 
edge of P means that the intersection of l with the polygon P is not empty. 
Therefore l is an edge of P if and only if the intersection point of $l_1$ and $l_2$ 
is external to the polygon P. If l is not an edge of P then l is 
redundant and can be eliminated. 

• The intersection point of any two adjacent edges of a convex polygon is a 
vertex of the polygon and therefore belongs to this polygon. Suppose $l_1$ and 
$l_2$ are two lines derived from two constraints of S with slopes 
$\alpha_1, \alpha_2$, respectively, satisfying $\alpha_1 < \alpha_2$, and there does 
not exist a line l from S whose slope $\alpha$ satisfies 
$\alpha_1 < \alpha < \alpha_2$. If the intersection point of $l_1$ and $l_2$ does not 
belong to P then one of these two corresponding constraints is redundant and can 
be eliminated. 

With preprocessing in which the set of linear constraints is sorted in 
O(N log N) time, our algorithm for eliminating the redundancy of a set of N linear 
constraints runs in O(N) time. 

These algorithms can be applied to detect and compute the intersection of 
two sets of linear constraints in linear time, provided the sets of linear constraints 
have been pre-sorted. Given two sets of linear constraints 
$S_1 = \{(a_{1j} X + b_{1j} Y + c_{1j} \le 0),\ j = 1 \ldots N_1\}$ and 
$S_2 = \{(a_{2j} X + b_{2j} Y + c_{2j} \le 0),\ j = 1 \ldots N_2\}$, the two 
sets $S_1$ and $S_2$ are merged into one set S. In order to detect the intersection 
of $S_1$ and $S_2$, we just apply the algorithm to check the feasibility of S: if S is 
feasible then $S_1$ intersects $S_2$; otherwise $S_1$ does not intersect $S_2$. In 
order to compute the intersection of $S_1$ and $S_2$, we apply the algorithm for 
eliminating redundancy to S. The resulting set is the intersection of $S_1$ and $S_2$. 
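
The merge-and-check idea can be tried out with the following Python sketch. It is not 
the linear time algorithm proposed here and does not use the algorithms of DYER or 
MEGIDDO: feasibility is decided naively by clipping a large bounding square against 
each half-plane in turn, which takes quadratic time in the worst case. The constants 
and function names are hypothetical, and a constraint is written as a triple (a, b, c) 
standing for a*X + b*Y + c <= 0. 

```python
EPS = 1e-9
BIG = 1e6  # assumed half-width of a bounding square that stands in for the plane


def clip(polygon, a, b, c):
    """Keep the part of a convex polygon where a*x + b*y + c <= 0."""
    result = []
    for k in range(len(polygon)):
        p, q = polygon[k], polygon[(k + 1) % len(polygon)]
        fp = a * p[0] + b * p[1] + c
        fq = a * q[0] + b * q[1] + c
        if fp <= EPS:                       # p lies inside the half-plane
            result.append(p)
        if (fp < -EPS and fq > EPS) or (fp > EPS and fq < -EPS):
            t = fp / (fp - fq)              # edge p-q crosses the boundary line
            result.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return result


def feasible_polygon(constraints):
    """Vertices of the feasible polygon inside the bounding square ([] if infeasible)."""
    polygon = [(-BIG, -BIG), (BIG, -BIG), (BIG, BIG), (-BIG, BIG)]
    for a, b, c in constraints:
        polygon = clip(polygon, a, b, c)
        if not polygon:
            break
    return polygon


def intersects(s1, s2):
    """S1 intersects S2 exactly when the merged set S = S1 + S2 is feasible."""
    return len(feasible_polygon(s1 + s2)) > 0
```

For two unit squares given as four constraints each, `intersects` reports whether they 
overlap, and `feasible_polygon(s1 + s2)` returns the vertices of the common region. 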

As shown above, with preprocessing in which the sets of linear constraints are sorted 
in O(N log N) time, our algorithms run in O(N) time, where $N = N_1 + N_2$. 

Using our algorithm in the context of spatial data, given two convex polygons 
represented by two sets of linear constraints with $N_1$ and $N_2$ constraints, 
respectively, we can detect and compute the intersection of the two given 
convex polygons in $O(N_1 + N_2)$ running time, which is optimal provided 
the two sets of linear constraints have been pre-sorted. 

Knowledge discovery in databases (KDD for short) is a fast-growing discipline of
informatics. It aims at finding new, previously unknown and potentially useful
patterns in large databases. Association rules [1] are a very popular form of
knowledge extracted from databases. The goal of KDD can be, e.g., finding
interesting patterns concerning the dependency of the quality of loans on
various attributes which can be derived from a bank database. These attributes
can be arranged, e.g., in the data matrix LOANS in Tab. 1. There are n loans;
each loan corresponds to one row of the data matrix. Each column corresponds to
an attribute derived from the database. Column TOWN contains information about
the domicile of the owner of the loan, column AGE contains the age of the owner
(in years). Column LOAN describes the quality of the loan (OK or BAD). Let us
emphasize that usually there are tens of attributes describing both static
(TOWN, AGE) and dynamic characteristics (derived, e.g., from the transactions)
of the owners of loans.

An example of an association rule with confidence 0.9 and support 0.1 is the
expression "TOWN[Tokyo] ⇒_{0.9,0.1} LOAN[OK]". It means that (i) at least 90
per cent of the loans whose owners live in Tokyo are OK and (ii) there are at
least 10 per cent of loans such that their owners live in Tokyo. In other words,
it means that a/(a+b) ≥ 0.9 and a/(a+b+c+d) ≥ 0.1, where a, b, c, d are
frequencies from the four-fold contingency table Tab. 2 of the Boolean
attributes TOWN[Tokyo] and LOAN[OK] for the data matrix LOANS.



row    TOWN     AGE   ...   LOAN
r_1    Tokyo    17    ...   OK
       Prague   17    ...   BAD

Tab. 1 - Data matrix LOANS





                LOAN[OK]   ¬LOAN[OK]
TOWN[Tokyo]     a          b
¬TOWN[Tokyo]    c          d

Tab. 2 - Four-fold contingency table



The frequency a is the number of rows (loans) in the data matrix LOANS
satisfying both the Boolean attribute TOWN[Tokyo] (the owner of the loan lives
in Tokyo) and the Boolean attribute LOAN[OK] (the loan is OK). The frequency b
is the number of rows satisfying TOWN[Tokyo] and not satisfying LOAN[OK], etc.
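As a minimal sketch of how the frequencies a, b, c, d and the rule with confidence 0.9 and support 0.1 could be evaluated, the following Python fragment counts a four-fold table and checks the two thresholds. The toy rows, the function names and the reading of support as a/(a+b+c+d) are assumptions made for illustration only.

```python
def four_fold_table(rows, phi, psi):
    """Count the frequencies (a, b, c, d) of two Boolean attributes over the rows."""
    a = b = c = d = 0
    for row in rows:
        if phi(row) and psi(row):
            a += 1          # phi and psi
        elif phi(row):
            b += 1          # phi and not psi
        elif psi(row):
            c += 1          # not phi and psi
        else:
            d += 1          # neither
    return a, b, c, d


def rule_holds(table, confidence, support):
    """Assumed reading: a/(a+b) >= confidence and a/(a+b+c+d) >= support."""
    a, b, c, d = table
    n = a + b + c + d
    if a + b == 0 or n == 0:
        return False
    return a / (a + b) >= confidence and a / n >= support


# Hypothetical data: 10 Tokyo loans (9 OK) and 10 Prague loans (5 OK).
loans = (
    [{"TOWN": "Tokyo", "LOAN": "OK"}] * 9
    + [{"TOWN": "Tokyo", "LOAN": "BAD"}]
    + [{"TOWN": "Prague", "LOAN": "OK"}] * 5
    + [{"TOWN": "Prague", "LOAN": "BAD"}] * 5
)
table = four_fold_table(
    loans,
    lambda r: r["TOWN"] == "Tokyo",
    lambda r: r["LOAN"] == "OK",
)
print(table)
print(rule_holds(table, 0.9, 0.1))
```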

This poster deals with generalized association rules. A generalized association
rule (GAR for short) is an expression of the form φ ≈ ψ, where φ and ψ are
Boolean attributes derived from the columns of the analysed data matrix. The
data matrix LOANS with derived Boolean attributes φ and ψ is in Tab. 3. Examples
of such attributes are φ = TOWN[Tokyo, Prague] ∧ AGE[17, 18, 19] and ψ =
TOWN[Berlin, Paris, London] ∨ AGE[77]. The Boolean attribute φ is true in a row
r_i iff, for this row, the value of TOWN is Tokyo or Prague and the value of
AGE is 17, 18 or 19; analogously for other Boolean attributes.

The symbol ≈ is called a 4FT quantifier. The GAR φ ≈ ψ corresponds to an
association between φ and ψ; the 4FT quantifier is the name of a kind of this
association. The value of the GAR φ ≈ ψ in a given data matrix can be true or
false. This value is computed using the associated function F_≈ of the 4FT
quantifier ≈. F_≈ is a {0, 1}-valued function defined for all four-fold
contingency tables. It is applied to the four-fold contingency table Tab. 4 of
φ and ψ for the given data matrix.



row    TOWN     AGE   ...   LOAN   φ   ψ
r_1    Tokyo    17    ...   OK     1   0
       Prague   17    ...   BAD    0   0

Tab. 3 - Derived Boolean attributes









        ψ    ¬ψ
φ       a    b
¬φ      c    d

Tab. 4 - Four-fold table of φ and ψ
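The derivation of Boolean attributes from columns can be illustrated with a short Python sketch. The helper names (attribute, conj, disj) and the sample rows are hypothetical; only the two example attributes φ and ψ are taken from the text.

```python
def attribute(column, values):
    """Basic Boolean attribute COLUMN[v1, ..., vk]: true iff the value is listed."""
    allowed = set(values)
    return lambda row: row[column] in allowed


def conj(p, q):
    """Conjunction of two Boolean attributes."""
    return lambda row: p(row) and q(row)


def disj(p, q):
    """Disjunction of two Boolean attributes."""
    return lambda row: p(row) or q(row)


# phi = TOWN[Tokyo, Prague] AND AGE[17, 18, 19]
phi = conj(attribute("TOWN", ["Tokyo", "Prague"]), attribute("AGE", [17, 18, 19]))
# psi = TOWN[Berlin, Paris, London] OR AGE[77]
psi = disj(attribute("TOWN", ["Berlin", "Paris", "London"]), attribute("AGE", [77]))

# Hypothetical rows, not taken from the data matrix LOANS.
rows = [
    {"TOWN": "Tokyo", "AGE": 17},
    {"TOWN": "Berlin", "AGE": 40},
    {"TOWN": "Prague", "AGE": 30},
]
derived = [(int(phi(r)), int(psi(r))) for r in rows]
print(derived)
```

Each pair in the result plays the role of the two derived 0/1 columns of one row; counting the four possible pairs over all rows yields the frequencies a, b, c, d of the four-fold table.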



The goals of this poster are:

— Mention various classes (implication, double implication and equivalence) of
4FT quantifiers. Each class contains both simple associations of Boolean
attributes and tests of statistical hypotheses.

— Introduce 4FT logical calculi, the formulae of which correspond to GARs.

— Argue for the usefulness of correct deduction rules of 4FT calculi for KDD.

— Show that there is a simple condition concerning the Boolean attributes φ, ψ,
φ' and ψ' which is satisfied iff the deduction rule leading from φ ≈ ψ to
φ' ≈ ψ' is correct. This condition depends on the class of 4FT quantifiers.

Genetic Programming (GP) was used to generate robot control programs for
an obstacle avoidance task [1]. The task was to control an autonomous mobile
robot from a starting point to a target point in a simulated environment. The
environment was filled with obstacles of several geometrical shapes.
In order to improve the robustness of the programs, each program was evaluated
under many environments. As a result, substantial processing time was
required to evaluate the fitness of the population of robot programs.

To reduce the processing time, the present study introduced a parallel
implementation. When the parallel approach was applied to the algorithm using
a conventional coarse-grained model, it achieved only linear speedup
because the amount of work was fixed: the algorithm terminated only when it
reached the maximum generation. Hence, the parallel algorithm did not exploit
the probabilistic advantage that the answer may be obtained before the maximum
generation.

In the present study we tried another method to further improve the speedup
by dividing the environments among the processing nodes. After a specific
number of generations, every subpopulation was migrated between processors using
a fully connected topology. The parallel algorithm was implemented on a
dedicated cluster of PC workstations with 350 MHz Pentium II processors, each
with 32 MB of RAM and running Linux as the operating system. These machines were
connected via 10 Mbps Ethernet. We extended the program used in [1] to
run on the cluster by using MPI as the message-passing library.

In the first stage of the implementation, the migration was synchronized. The
synchronized migration resulted in uneven work loads among the processors.
This was due to the fact that the robot performed the task until it either
reached the target point or hit an iteration limit. Hence, this migration
scheme caused the evolution to wait for the slowest node.

In the second stage of the implementation, we attempted to further improve
the speedup of the parallel algorithm by asynchronous migration. When the
fastest node reached a predetermined generation number, a migration request
was sent to all subpopulations. Therefore, this scheme caused the evolution of
all subpopulations to proceed according to the fastest node.
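The difference between the two migration schemes can be illustrated with a toy timing model. It is only a sketch under simplifying assumptions: each node is described by the time it needs to finish one migration interval, communication cost is ignored, and the function names and the sample times are hypothetical.

```python
def epoch_length(node_times, synchronous):
    """Length of one migration interval under the toy model.

    Synchronous: every node waits at the barrier for the slowest node.
    Asynchronous: migration is triggered as soon as the fastest node is done.
    """
    return max(node_times) if synchronous else min(node_times)


def barrier_idle_time(node_times):
    """Total time the nodes spend waiting for the slowest one (synchronous case)."""
    slowest = max(node_times)
    return sum(slowest - t for t in node_times)


# Hypothetical per-node times for one interval; they differ because each robot
# run stops either at the target point or at the iteration limit.
times = [4.0, 6.0, 9.0, 5.0]
print(epoch_length(times, synchronous=True))
print(epoch_length(times, synchronous=False))
print(barrier_idle_time(times))
```

In this model the uneven work load shows up as barrier idle time in the synchronous scheme, which is the quantity the asynchronous scheme is meant to reduce.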

The most widely used performance measure for a parallel algorithm is the
parallel speedup. To make an adequate comparison between the serial algorithm
and the parallel algorithm, E. Cantu-Paz [2] suggested that the two must give
the same quality of solution. In this paper, the quality of the solution is
defined in terms of robustness. The robustness of the programs generated by
the parallel algorithm was demonstrated to be better than that of the serial
algorithm. Consequently, the amount of work done by the parallel algorithm in
this experiment was not less than that of the serial algorithm.

Figure 1 illustrates the speedup observed on the two implementations as a 
function of the number of processors used. Both implementations exhibit super- 
linear speedup. The speedup curves taper off for 10 processors and the perfor- 
mance of the asynchronous implementation is slightly better than the perfor- 
mance of the synchronous implementation. 




Fig. 1. Speedup as a function of the number of processors



Timing analyses reveal the cause of this tapering. The performance degradation
with 10 processors is caused by excessive communication time due to the
broadcast function. Although asynchronous migration reduces the barrier time
effectively compared to synchronous migration, the increase in the broadcast
time with 10 processors obliterates this advantage. However, for a small
number of processors (2, 4, 6), the reduction of the communication overhead
achieved by asynchronous migration compared with synchronous migration is
considerable: the reduction for 2, 4 and 6 nodes is 96.09%, 84.44% and 62.42%,
respectively. In terms of wall-clock time, the asynchronous implementation in
this work using 10 nodes is 21 times faster than the serial algorithm.
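For reference, the usual definitions behind these figures can be written as a few lines of Python. The wall-clock times below are hypothetical values chosen only so that their ratio matches the reported factor of 21 on 10 nodes; the function names are likewise illustrative.

```python
def speedup(serial_time, parallel_time):
    """Parallel speedup: serial wall-clock time divided by parallel wall-clock time."""
    return serial_time / parallel_time


def efficiency(serial_time, parallel_time, processors):
    """Speedup per processor; values above 1 indicate super-linear speedup."""
    return speedup(serial_time, parallel_time) / processors


def is_superlinear(serial_time, parallel_time, processors):
    return speedup(serial_time, parallel_time) > processors


# Hypothetical times with ratio 21, evaluated for 10 processors.
print(speedup(210.0, 10.0))
print(efficiency(210.0, 10.0, 10))
print(is_superlinear(210.0, 10.0, 10))
```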

Abstract. Verification of formal specifications for multimedia systems,
with time taken into consideration, is still a subject of much research.
Estelle, an internationally standardised Formal Description Technique
(FDT) based on an extended finite state machine model, does not have
enough expressive power to describe the timing behaviours of distributed
multimedia systems. To address this limitation, we have developed
Time-Estelle, an extended Estelle which is capable of doing so. We have also
developed a methodology for verifying Time-Estelle specifications, which
include time properties. The verification method [1] involves translating
a Time-Estelle specification to Communicating Time Petri Nets, which
can then be verified using the tool ORIS. In this paper, we describe
the results obtained by applying our method to a real-life multimedia
protocol, the Reliable Adaptive Multicast Protocol (RAMP).



1 The Verification Method 

The verification of a complex and large multimedia system requires partitioning
the system into components of manageable size. A reusable and time-dependent
formal model is necessary for managing such a system. The primary stage
consists of a formal time and functional description of the system. The
functional description is provided by the standard Estelle specification. The
time description is provided by the extended Estelle specification. The
intermediate stage focuses on translating the Time-Estelle specification to
Communicating Time Petri Nets (CmTPNs) [2]. In this stage, all the Time-Estelle
module bodies and the interaction points in the module header of any
corresponding module body become places in a CmTPN, and the transitions in a
module body become transitions in the CmTPN. Tokens in a place denote the fact
that the message and state belonging to the module instance will be processed
by one of the enabled transitions in the module body. The time information of
Time-Estelle modules and transitions becomes the time constraints on the
transitions and arcs of the nets.
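The structural part of this mapping can be sketched with a few Python data classes. Everything here is a hypothetical simplification: the class and field names, the representation of time information as an (earliest, latest) firing interval, and the omission of arcs and tokens are assumptions for illustration, not the translation rules of the method or the input format of ORIS.

```python
from dataclasses import dataclass, field


@dataclass
class EstelleTransition:
    name: str
    earliest: float  # assumed lower bound of the firing interval
    latest: float    # assumed upper bound of the firing interval


@dataclass
class EstelleModule:
    body: str
    interaction_points: list
    transitions: list


@dataclass
class CmTPN:
    places: list = field(default_factory=list)
    transitions: dict = field(default_factory=dict)  # name -> (earliest, latest)


def translate(modules):
    """Map module bodies and interaction points to places, transitions to transitions."""
    net = CmTPN()
    for module in modules:
        net.places.append(module.body)
        net.places.extend(f"{module.body}.{ip}" for ip in module.interaction_points)
        for t in module.transitions:
            net.transitions[f"{module.body}.{t.name}"] = (t.earliest, t.latest)
    return net


# Hypothetical two-module system.
sender = EstelleModule("Sender", ["out"], [EstelleTransition("send", 0.0, 2.0)])
receiver = EstelleModule("Receiver", ["in"], [EstelleTransition("ack", 1.0, 3.0)])
net = translate([sender, receiver])
print(net.places)
print(net.transitions)
```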

The final stage executes and analyses the system using the CmTPN-based
tool ORIS. This approach highlights how the validation of CmTPN models
relieves the state explosion problem through integration analysis. Integration
analysis is the validation of a composed CmTPN system through the integration
of the results of the unit analysis of its component modules. A merit of
integration analysis is its flexible management of state space explosion by
permitting a systematic concealment of local events.



2 Verification Results 

The verification exercises revealed that the Burst mode of RAMP [3] does not
have any deadlock or livelock in all the normal cases, when no collision
occurs. However, in the case of collision and overdue delay, but without any
functional operations in the "unexpected" states, deadlocks but no livelocks
have been uncovered. The deadlocks are set out below:

a) A Receiver issues the Ack message to the Sender later than the time expected
by the Sender, and the Receiver has changed to another state. The Sender has
released a Resent message simultaneously, but the Receiver does not functionally
respond to the Resent message. As a result, a deadlock occurs.

b) A Receiver issues a Resent message to the Sender later than the idle time
of the current burst period, and the Sender has changed to the next burst state.
The Sender cannot functionally resend the data messages of the previous burst
data stream after the time-out; this state is unreachable. As a result, a
deadlock occurs.

The following describes the results for verifying the Idle mode of RAMP. It is 
verified against the deadlock freeness property. The verification exercises revealed 
that there are several deadlocks when a collision or an unreliable situation occurs. 
The deadlocks are set out below: 

a) If the Receiver does not receive an Idle message or a Data message within an
adaptive time-out interval from the last Data message, the Receiver issues the
Resent message with the next sequence number. However, in the case of collision,
the Sender does not receive the Resent message, so the Sender does not resend
the message. The Receiver loses this message. As a result, a deadlock occurs in
the transfer of this message.

b) If, after a time-out, the Receiver does not receive the missing message, the
Receiver closes the control channel with the Sender and this closes the RAMP
flow to the Sender. However, the Sender does not receive the closed message
from the Receiver, so the Sender continues sending the Idle message. As a
result, a deadlock occurs in the Sender.

c) Receivers periodically unicast Idle messages to the Sender. Any Receiver's
message to the Sender resets the Receiver Idle time-out period. If the Sender
does not receive an Idle message from a Receiver, the Sender closes the
connection to that Receiver. However, the Sender loses the Idle messages
periodically, so the Sender also closes the connection periodically. As a
result, a deadlock occurs in the Sender.

Abstract. Current standards of ATM can only support pt-pt (or unicast)
connections and unidirectional point-to-multipoint (pt-mpt) connections and do
not provide a scalable solution for truly multipoint-to-multipoint (mpt-mpt)
communication. The main reason is that AAL5 does not provide multiplexing
identification on a per-cell basis. Cells from different packets on a single
connection cannot be interleaved. To preserve the AAL5 structure, additional
mechanisms are needed at the merging point to differentiate packets and
prevent cell mixing. In our previous study [1], we proposed a Fair
Intelligent Congestion Control (FICC) for ABR point-to-point traffic. It was
demonstrated that FICC is simple, robust, efficient, scalable and fair relative
to other proposed congestion control algorithms. In this paper we propose to
apply FICC together with a simple queueing and scheduling mechanism to provide
efficient, fair bandwidth allocation and congestion control in a
multipoint-to-point (mpt-pt) connection for heterogeneous service with
different data rates. The simulation results show that FICC preserves all the
desirable point-to-point properties and performs equally well in
multipoint-to-point connections.



FICC for Mpt-pt ABR Service in ATM Network 

I. Description of the Fair Intelligent Congestion Control 

The Fair Intelligent Congestion Control [1] treats the rate allocation
mechanisms for non-congestion and congestion periods in a similar and consistent
manner. It aims for a target operating point where the switch queue length is
at an optimum level for good throughput and low delay, and where the rate
allocation is optimum for each connection. In order to estimate the current
traffic generation rate of the network and allocate it among connections fairly,
a Mean Allowed Cell Rate per output queue is kept at the switch. An explicit
rate is then calculated as a function of the Mean Allowed Cell Rate and a queue
control function.
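The general shape of such a calculation can be sketched as follows. This is not the FICC formula: the piecewise-linear queue control function, the product form of the explicit rate, and all parameter names and values are assumptions chosen only to show how a queue-dependent factor can steer the rate towards a target queue length.

```python
def queue_control(queue_len, target_len, buffer_size, max_gain=1.5):
    """Hypothetical queue control function.

    Returns a factor above 1 while the queue is below the target length
    (inviting more traffic), exactly 1 at the target, and a factor that
    falls to 0 as the queue approaches the buffer size.
    """
    if queue_len <= target_len:
        return max_gain - (max_gain - 1.0) * queue_len / target_len
    return max(0.0, (buffer_size - queue_len) / (buffer_size - target_len))


def explicit_rate(macr, queue_len, target_len, buffer_size):
    """Assumed form: explicit rate = Mean Allowed Cell Rate * queue control factor."""
    return macr * queue_control(queue_len, target_len, buffer_size)


# Hypothetical switch state: MACR of 100 cells/ms, target queue 50, buffer 100.
for q in (0, 50, 75, 100):
    print(q, explicit_rate(100.0, q, 50, 100))
```

With a function of this kind, the advertised rate equals the Mean Allowed Cell Rate at the target operating point and is scaled up or down as the queue drifts below or above it.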

II. Queueing and Scheduling Mechanism of the Data 

The queueing and scheduling mechanism employed is the same mechanism proposed
in [2]. Essentially, the mechanism is implemented as follows:

• To isolate cells from different source VCs and to prevent cells from different
packets from interleaving, a separate queue is employed to collect the cells of
a packet from a given source VC; cells from these queues are merged
(packet-interleaving) into one single VC.
• To avoid cell-interleaving, cells from each queue for the mpt-pt connection
are scheduled for transmission on a per-packet basis. A packet is fully
transmitted before any cells from the next packet are transmitted.
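The two rules above can be sketched as a small merge routine. It is a simplified, store-and-forward illustration: the cell representation as (source VC, payload, end-of-packet flag) and the function name are hypothetical, and a packet is released to the outgoing VC only once its last cell has arrived.

```python
from collections import defaultdict, deque


def merge_without_cell_interleaving(cells):
    """Merge cells from several source VCs into one VC, packet by packet.

    cells: iterable of (source_vc, payload, end_of_packet) in arrival order.
    Returns the outgoing cell sequence as (source_vc, payload) pairs.
    """
    per_vc = defaultdict(list)   # one collecting queue per source VC
    ready = deque()              # complete packets awaiting transmission
    for vc, payload, end_of_packet in cells:
        per_vc[vc].append(payload)
        if end_of_packet:
            ready.append((vc, per_vc.pop(vc)))

    outgoing = []
    while ready:
        vc, packet = ready.popleft()
        # A packet is transmitted in full before the next packet starts.
        outgoing.extend((vc, payload) for payload in packet)
    return outgoing


# Cells of two packets arrive interleaved at the merging point.
arrivals = [
    ("A", "a1", False),
    ("B", "b1", False),
    ("A", "a2", True),
    ("B", "b2", True),
]
print(merge_without_cell_interleaving(arrivals))
```

On the outgoing VC the cells of packet A appear contiguously, followed by those of packet B, so a receiver can reassemble each packet without per-cell multiplexing identification.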

III. Switch Algorithm for Multipoint-to-Point ABR 

if received_backward_RM(ACR, ER, CI) {
    Process this RM cell as in the unicast ABR algorithm,
    in our case the FICC algorithm;
    Forward this RM cell to the corresponding link;
    Return a backward RM(ACR, ER=MER, CI=MCI) cell to the
    source;
}

if received_forward_RM(ACR, ER, CI) {
    Process this RM cell as in the unicast ABR algorithm,
    in our case FICC;
    Update the congestion condition: MCI=CI;
    Update the Explicit Rate on the connection: MER=ER;
    Discard this RM cell;
}

Fig. 1. Switch Algorithm for Multipoint-to-Point ABR Service 

The modified algorithm employed in this paper differs from the original [2] in
that the unicast ABR algorithm is replaced by our Fair Intelligent Congestion
Control in both the forward and backward directions.

IV. Simulation Results and Analysis 

Simulations for various configurations have been performed. However, due to
space limitations, only the simulation results for an mpt-pt connection in the
parking-lot configuration are presented.




Fig. 2. FICC parking-lot configuration: a) Unicast, b) 4-to-1 multipoint connection



Simulation results demonstrate that the Fair Intelligent Congestion Control
scheme is effective in congestion control and allocates bandwidth fairly among
VCs in an mpt-pt connection. It preserves all its desirable properties from the
point-to-point application. These properties include simplicity, in that it
does not require per-VC accounting; robustness, in that it does not depend
critically on switch parameters; scalability, in that the switch buffer
requirements do not depend on the number of VC connections; and fairness, in
that it can allocate bandwidth fairly among its connections. More importantly,
FICC achieves high throughput and acceptable delay, and these two performance
parameters can be specified in terms of the target operating point.